ALIGNMENTS



RESULT 1 
AAB72504 

ID AAB72504 standard; peptide; 15 AA. 
XX 

AC AAB72504; 
XX 

DT 09-MAY-2001 (first entry) 
XX 

DE Colostrinin peptide #5. 
XX 

KW Dermatological; oxidative stress regulator; colostrinin. 
XX 

OS Unidentified. 
XX 

PN WO200112650-A2 . 
XX 



PD 22-FEB-2001. 
XX 

PF 17-AUG-2000; 2000WO-US022665 . 
XX 

PR 17-AUG-1999; 99US-0149310P.
XX 

PA (TEXA ) UNIV TEXAS SYSTEM. 
XX 

PI Stanton GJ, Hughes TK, Boldogh I; 
XX 

DR WPI; 2001-218342/22. 
XX 

PT Modulating oxidative stress level in a cell, involves contacting the cell 

PT with an oxidative stress regulator selected from colostrinin, its 

PT constituent peptide, analog or their combinations, 
XX 

PS Claim 6; Page 25; 48pp; English. 
XX 

CC The present invention relates to a method for modulating the oxidative 

CC stress level in a cell or a patient, comprising contacting the cell with, 

CC or administering to the patient, an oxidative stress regulator selected 

CC from colostrinin, or its constituent peptide (e.g. the present peptide), 

CC to change the level of an oxidising species in the cell. The method can 

CC be used to treat oxidative damage to skin, by decreasing or preventing an 

CC increase in the level of damage to a biomolecule of the patient 
XX 

SQ Sequence 15 AA; 

Query Match 100.0%; Score 81; DB 4; Length 15; 
Best Local Similarity 100.0%; Pred. No. 1.1e-05;

Matches 15; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 1 DLEMPVLPVEPFPFV 15
     |||||||||||||||
Db 1 DLEMPVLPVEPFPFV 15
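
Each record in this listing uses two-letter line codes (ID, AC, DE, PT, CC and so on) with XX as a separator. The following is a minimal sketch of how such a record could be grouped by line code for further processing. The function name `parse_record` and the choice to join continuation lines with spaces are illustrative assumptions, not part of the search software.

```python
from collections import defaultdict


def parse_record(text):
    """Group the lines of one flat-file record by their two-letter line code.

    Hypothetical helper for illustration only.
    """
    fields = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "XX":  # blank lines and XX separators carry no data
            continue
        code, _, value = line.partition(" ")
        # Keep only two-letter upper-case codes; this skips lines such as
        # "Query Match ..." as well as the "Qy" and "Db" alignment rows.
        if len(code) == 2 and code.isupper():
            fields[code].append(value.strip())
    # Continuation lines (several PT or CC lines) are joined with single spaces.
    return {code: " ".join(values) for code, values in fields.items()}


record = """\
ID AAB72504 standard; peptide; 15 AA.
XX
AC AAB72504;
XX
DE Colostrinin peptide #5.
XX
PT Modulating oxidative stress level in a cell, involves contacting the cell
PT with an oxidative stress regulator selected from colostrinin, its
"""

parsed = parse_record(record)
print(parsed["AC"])
print(parsed["DE"])
```

Grouping by line code keeps multi-line fields such as PT and CC together, which is convenient when the same patent family appears under several accession numbers.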



RESULT 2 
AAB59322 

ID AAB59322 standard; peptide; 15 AA. 
XX 

AC AAB59322; 
XX 

DT 21-MAR-2001 (first entry) 
XX 

DE Ewe colostrinin peptide fragment B-7. 
XX 

KW Sheep; colostrinin; proline rich polypeptide; colostrum; immune disorder; 

KW central nervous system disorder; dietary supplement; beta-amyloid plaque. 
XX 

OS Ovis sp. 
XX 

PN WO200075173-A2 . 
XX 

PD 14-DEC-2000. 
XX 

PF 02-JUN-2000; 2000WO-GB002128.
XX

PR 02-JUN-1999; 99GB-00012852.
XX 

PA (REGE-) REGEN THERAPEUTICS PLC. 
XX 

PI Georgiades JA; 
XX 

DR WPI; 2001-071058/08. 
XX 

PT Peptides having an N-terminal amino acid sequence isolated from 

PT colostrinin for treating e.g. disorders of the central nervous system and 

PT immune system, viral and bacterial infections, and diseases characterized 

PT by amyloid plaques.

XX 

PS Claim 7; Page 27; 63pp; English. 
XX 

CC The present invention provides the sequences of a number of peptides 

CC found in ewe's colostrinin. Colostrinin is the proline-rich polypeptide 

CC fragment of colostrum. These peptides can be used in the treatment of 

CC central nervous system disorders such as senile dementia, Parkinson's 

CC disease, Alzheimer's disease, psychosis and neurosis, immune system

CC disorders such as bacterial and viral infections, to improve the 

CC development of a child's immune system, as a dietary supplement, and to 

CC promote the dissolution of beta-amyloid plaques 
XX 

SQ Sequence 15 AA; 

Query Match 100.0%; Score 81; DB 4; Length 15; 
Best Local Similarity 100.0%; Pred. No. 1.1e-05;

Matches 15; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 1 DLEMPVLPVEPFPFV 15
     |||||||||||||||
Db 1 DLEMPVLPVEPFPFV 15




RESULT 3
AAB72250

ID AAB72250 standard; peptide; 15 AA.
XX

AC AAB72250;
XX

DT 14-MAY-2001 (first entry)
XX

DE Colostrinin derived cytokine inducing peptide SEQ ID 5.
XX

KW Colostrinin; immune response; cytokine; blood cell proliferation;

KW central nervous system disorder; neurological disorder; mental disorder;

KW dementia; neurodegenerative disease; Alzheimer's disease; psychosis;

KW neurosis; infection.
XX

OS Synthetic.
XX

PN WO200111937-A2.
XX

PD 22-FEB-2001.
XX


PF 17-AUG-2000; 2000WO-US022818 . 
XX 

PR 17-AUG-1999; 99US-0149311P . 
XX 

PA (TEXA ) UNIV TEXAS SYSTEM. 

PA (REGE-) REGEN THERAPEUTICS PLC. 

XX 

PI Stanton GJ, Hughes TK, Boldogh I, Georgiades J;
XX 

DR WPI; 2001-202804/20. 
XX 

PT Inducing a cytokine and modulating an immune response, useful for 

PT treating central nervous system diseases and bacterial and viral 

PT infections, comprises administering colostrinin as an immunological 

PT regulator. 
XX 

PS Claim 1; Page 34; 50pp; English. 
XX 

CC Sequences AAB72246 - AAB72275 represent peptides derived from colostrinin,

CC a proline rich polypeptide aggregate contained in colostrum. The peptides 

CC have immune response modulatory activity, and are capable of inducing 

CC cytokines. Colostrinin and its derived peptides are useful for inducing

CC cytokine production, for modulating an immunological response and for 

CC inducing blood cell proliferation. The peptides are useful in the 

CC treatment of disorders of the central nervous system, neurological 

CC disorders, mental disorders, dementia, neurodegenerative diseases, 

CC Alzheimer's disease, motor neurone disease, psychosis, neurosis, chronic 

CC disorders of the immune system, bacterial and viral infections and 

CC acquired immunological deficiencies 

XX 

SQ Sequence 15 AA; 

Query Match 100.0%; Score 81; DB 4; Length 15; 
Best Local Similarity 100.0%; Pred. No. 1.1e-05;

Matches 15; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 1 DLEMPVLPVEPFPFV 15
     |||||||||||||||
Db 1 DLEMPVLPVEPFPFV 15



RESULT 4 
AAB72536 

ID AAB72536 standard; peptide; 15 AA. 
XX 

AC AAB72536; 
XX 

DT 09-MAY-2001 (first entry) 
XX 

DE Colostrinin peptide #5. 
XX 

KW Neuroprotective; neural cell differentiation regulator; colostrinin; 

KW colostrum. 

XX 

OS Unidentified. 
XX 

PN WO200112651-A2.

CC differentiation and treating damaged neural cells, using colostrinin and

CC colostrinin constituent peptides (e.g. the present peptide) as a neural

CC cell regulator. Colostrinin is a polypeptide complex found in colostrum
XX




SQ Sequence 15 AA;

Query Match 100.0%; Score 81; DB 4; Length 15;
Best Local Similarity 100.0%; Pred. No. 1.1e-05;
Matches 15; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 1 DLEMPVLPVEPFPFV 15
     |||||||||||||||
Db 1 DLEMPVLPVEPFPFV 15



RESULT 5 
AAO14581

ID AAO14581 standard; peptide; 15 AA.
XX

AC AAO14581;
XX 

DT 27-MAY-2002 (first entry) 
XX 

DE Neural cell regulatory colostrinin peptide 5. 
XX 

KW Neural cell differentiation; neural cell regulator; colostrinin peptide; 
KW neural cell formation; proline-rich polypeptide aggregate; colostrum; 
KW neural cell treatment. 
XX 

OS Unidentified. 
XX 

FH Key Location/Qualifiers 
FT Modified-site 15

FT /note= "Optional C-terminal amide" 

XX 

PN WO200213851-A1. 
XX 



PD 21-FEB-2002. 
XX 

PF 17-AUG-2000; 2000WO-US022777 . 
XX 

PR 17-AUG-2000; 2000WO-US022777 . 
XX 

PA (TEXA ) UNIV TEXAS SYSTEM. 
XX 

PI Boldogh I, Stanton JG, Hughes TK; 
XX 

DR WPI; 2002-269152/31. 
XX 

PT Promoting cell differentiation in a patient involves use of blood cell 

PT regulator selected from colostrinin, its constituent peptide and/or 

PT analog. 
XX 

PS Claim 7; Page 21; 37pp; English. 
XX 

CC The invention comprises a method for promoting cell differentiation (e.g. 

CC neural cell differentiation). The method involves contacting cells with a

CC neural cell regulator (i.e. a colostrinin peptide) in order to change the 

CC cells in morphology to form neural cells. Colostrinin is a proline-rich 

CC polypeptide aggregate that is present in colostrum. The method of the 

CC invention is useful for promoting the differentiation of cells and for 

CC treating damaged neural cells in a patient. The present amino acid 

CC sequence represents a specifically claimed colostrinin peptide used in 

CC the method of the invention 
XX 

SQ Sequence 15 AA; 

Query Match 100.0%; Score 81; DB 5; Length 15; 
Best Local Similarity 100.0%; Pred. No. 1.1e-05;

Matches 15; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 1 DLEMPVLPVEPFPFV 15
     |||||||||||||||
Db 1 DLEMPVLPVEPFPFV 15



RESULT 6
AAM51040

ID AAM51040 standard; peptide; 15 AA.
XX

AC AAM51040;
XX

DT 30-MAY-2002 (first entry)
XX

DE Colostrinin constituent peptide.
XX

KW Colostrinin; colostrum; immunomodulator; cardiovascular;

KW blood cell regulator; cytokine inducer; beta-casein; human
XX

OS Homo sapiens.
XX

FH Key Location/Qualifiers

FT Modified-site 15

FT /note= "optional C-terminal amidation"

XX

PN WO200213849-A1. 
XX 

PD 21-FEB-2002. 
XX 

PF 17-AUG-2000; 2000WO-US022775.
XX

PR 17-AUG-2000; 2000WO-US022775.
XX 

PA (TEXA ) UNIV TEXAS SYSTEM. 

PA (REGE-) REGEN THERAPEUTICS PLC. 

XX 

PI Stanton GJ, Hughes TK, Boldogh I, Georgiades J;
XX 

DR WPI; 2002-269150/31. 
XX 

PT Modulation of blood cell proliferation in a patient involves use of blood 

PT cell regulator selected from colostrinin, its constituent peptide and/or 

PT analog. 
XX 

PS Claim 1; Page 34; 54pp; English. 
XX 

CC The present sequence is that of a colostrinin constituent peptide that is 

CC preferred for use as an immunological regulator and as a blood cell 

CC regulator in claimed methods of the invention. It is classified as having 

CC a beta-casein homologue precursor. Methods are claimed for: inducing a 

CC cytokine in a cell by contact with an immunological regulator, where the 

CC cell is present in a cell culture, a tissue, an organ or an organism, and 

CC the cell is mammalian, including human; modulating an immune response in 

CC a cell by contact with the immunological regulator under conditions 

CC effective to induce a cytokine; modulating an immune response in a 

CC patient by administering an immunological regulator under conditions 

CC effective to induce a cytokine, where the immunological regulator is 

CC administered topically or as part of a dietary supplement, and where the 

CC immune response is specific or non specific, an interferon response or an 

CC antibody response; modulating blood cell proliferation by contacting 

CC blood cells with a blood cell regulator, where the blood cells are 

CC present in a cell culture or an organism, are mammalian or human, and 

CC where the blood cells are increased in number or differentiated; and a 

CC method for modulating blood cell proliferation in a patient. A claimed

CC cytokine-inducing composition comprises a pharmaceutical carrier and an 

CC active agent such as the present peptide. Cytokines induced by this 

CC peptide in human leucocyte cultures include interferon-gamma, tumour

CC necrosis factor-alpha, interleukin-6 and interleukin-10 

XX 

SQ Sequence 15 AA; 

Query Match 100.0%; Score 81; DB 5; Length 15; 
Best Local Similarity 100.0%; Pred. No. 1.1e-05;

Matches 15; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 1 DLEMPVLPVEPFPFV 15
     |||||||||||||||
Db 1 DLEMPVLPVEPFPFV 15



RESULT 7 



AAE20232 

ID AAE20232 standard; peptide; 15 AA.
XX

AC AAE20232;
XX 

DT 18-JUN-2002 (first entry) 
XX 

DE Colostrinin constituent peptide #5. 
XX 

KW Blood cell regulator; colostrinin; constituent peptide; oxidative stress;

KW therapy; oxidative damage; skin; aging; wound healing; cell replacement;

KW tissue; organ; cosmetic procedure; repair; regeneration; preservation;

KW transplantation; implantation; dermatological; vulnerary.
XX 

OS Unidentified. 
XX 

FH Key Location/Qualifiers

FT Modified-site 15 

FT /note= "Optionally C-terminal amide" 
XX 

PN WO200213850-A1. 
XX 

PD 21-FEB-2002. 
XX 

PF 17-AUG-2000; 2000WO-US022776 . 
XX 

PR 17-AUG-2000; 2000WO-US022776.
XX 

PA (TEXA ) UNIV TEXAS SYSTEM. 
XX 

PI Stanton GJ, Hughes TK, Boldogh I; 
XX 

DR WPI; 2002-269151/31. 
XX 

PT Composition useful for the modulation of blood cell proliferation in a 

PT patient comprises a blood cell regulator selected from colostrinin, its 

PT constituent peptide and/or analog. 
XX 

PS Claim 6; Page 25; 51pp; English. 
XX 

CC The invention relates to a composition which comprises a blood cell 

CC regulator selected from colostrinin, its constituent peptide and/or 

CC analogue. The invention is used for modulating the oxidative stress level 

CC in a cell e.g. mammalian or human cell present in a cell culture, tissue, 

CC organ, or organism; or for treating oxidative damage to the skin of a 

CC patient e.g. animal or human; to modulate oxidative stress during/after

CC a premature birth or normal birth, preventing/delaying aging in a 

CC patient, enhancing wound healing, and the reduction of side effects of 

CC cosmetic procedures. The method changes the level of an oxidising species 

CC in the cell, such as decreases or prevents increase in the level of 

CC damage to a biomolecule of the patient selected from DNA, protein and/or 

CC lipid, compared to the same conditions when the oxidative stress 

CC regulator is not present. The modulation of oxidative stress results in 

CC enhanced repair, regeneration, and replacement of cells, tissues and 

CC organs (e.g. kidney, liver, pancreas, skin, and the other internal and 

CC external organs), as well as enhanced preservation of such organs for 

CC transplantation, implantation, or scientific research. The present 



CC sequence is a colostrinin constituent peptide 
XX 

SQ Sequence 15 AA; 

Query Match 100.0%; Score 81; DB 5; Length 15; 

Best Local Similarity 100.0%; Pred. No. 1.1e-05;

Matches 15; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 1 DLEMPVLPVEPFPFV 15
     |||||||||||||||
Db 1 DLEMPVLPVEPFPFV 15

PA (REGE-) REGEN THERAPEUTICS PLC.
XX

PI Georgiades JA;
XX

DR WPI; 2001-071058/08.
XX

PT Peptides having an N-terminal amino acid sequence isolated from

PT colostrinin for treating e.g. disorders of the central nervous system and

PT immune system, viral and bacterial infections, and diseases characterized

PT by amyloid plaques.
XX

PS Claim 8; Page 27; 63pp; English.
XX

CC The present invention provides the sequences of a number of peptides

CC found in ewe's colostrinin. Colostrinin is the proline-rich polypeptide

CC fragment of colostrum. These peptides can be used in the treatment of

CC central nervous system disorders such as senile dementia, Parkinson's

CC disease, Alzheimer's disease, psychosis and neurosis, immune system

CC disorders such as bacterial and viral infections, to improve the

CC development of a child's immune system, as a dietary supplement, and to

CC promote the dissolution of beta-amyloid plaques
XX

SQ Sequence 16 AA; 



Query Match 100.0%; Score 81; DB 4; Length 16; 

Best Local Similarity 100.0%; Pred. No. 1.2e-05; 

Matches 15; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 1 DLEMPVLPVEPFPFV 15
     |||||||||||||||
Db 2 DLEMPVLPVEPFPFV 16



RESULT 9 

PF 26-JAN-2001; 2001WO-GB000329.
XX




PR 26-JAN-2000; 2000GB-00001825.
XX

PA (REGE-) REGEN THERAPEUTICS PLC.
XX

PI Georgiades JA;
XX

DR WPI; 2001-488775/53.
XX

PT Peptide useful as an interalia in the treatment of e.g. disorders of the

PT immune system and the central nervous system comprises ten amino-terminal

PT amino acid sequence derived from peptides present in colostrinin.
XX

PS Claim 1; Page 15; 40pp; English.
XX

CC The invention relates to colostrinin peptide fragments which are useful,

CC inter alia, in the treatment of chronic disorders of the immune system

CC and the central nervous system. Colostrinin peptides are used as a

CC medicament in the treatment of neurological disorders e.g., dementia,

CC neurodegenerative disorders e.g., Alzheimer's disease, motor neuron

CC disease e.g., Parkinson's disease, mental disorders e.g. psychosis and



CC neurosis, in acquired immunological deficiencies, chronic bacterial and 

CC viral infections and diseases characterised by the presence of beta- 

CC amyloid plaques and as a dietary supplement for babies, small children, 

CC adults and senile persons, who have been subjected to chemotherapy or 

CC have suffered from cachexia or weight loss due to the chronic disease. 

CC Colostrinin peptides are also used as food additives and as an auxiliary 

CC withdrawal treatment for drug addicts, after a period of detoxification 

CC and in persons dependent on stimulants. Colostrinin peptides are used to 

CC prepare antibodies and to treat emotional disturbances, e.g. emotional 

CC disturbances of psychiatric patients in a state of depression. These 

CC colostrinin peptides improves the development of immune system in a new 

CC born child and to correct the immunological deficiencies in a child. The 

CC present sequence is colostrinin peptide 3 related to the invention 
XX 

SQ Sequence 10 AA; 

Query Match 70.4%; Score 57; DB 4; Length 10; 

Best Local Similarity 100.0%; Pred. No. 0.031; 

Matches 10; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 5 PVLPVEPFPF 14
     ||||||||||
Db 1 PVLPVEPFPF 10



RESULT 10 
AAE07197 

ID AAE07197 standard; peptide; 10 AA. 
XX 

AC AAE07197; 
XX 

DT 06-NOV-2001 (first entry) 
XX 

DE Modified colostrinin cyclic peptide #3. 
XX 

KW Colostrinin; nootropic; neuroprotective; immunomodulatory; antibacterial; 
KW Parkinson's disease; Alzheimer's disease; mental disorder; food additive;
KW central nervous system disorder; neurodegenerative disorder; weight loss; 
KW beta-amyloid plaque; psychosis; neurosis; cachexia; motor neuron disease; 
KW acquired immunological deficiency; neurological disorder; dementia; 
KW antiviral; cyclic. 
XX 

OS Synthetic. 
XX 

FH Key Location/Qualifiers

FT Modified-site 1

FT /note= "N-terminal acetyl; this residue forms a cyclic 

FT linkage with Pro found at the C-terminal end" 

XX 

PN WO200155199-A1. 
XX 

PD 02-AUG-2001. 
XX 

PF 26-JAN-2001; 2001WO-GB000329 . 
XX 

PR 26-JAN-2000; 2000GB-00001825 . 
XX 



PA (REGE-) REGEN THERAPEUTICS PLC. 
XX 

PI Georgiades JA; 
XX 

DR WPI; 2001-488775/53. 
XX 

PT Peptide useful as an interalia in the treatment of e.g. disorders of the 

PT immune system and the central nervous system comprises ten amino-terminal 

PT amino acid sequence derived from peptides present in colostrinin. 
XX 

PS Example 2; Page 8; 40pp; English. 
XX 

CC The invention relates to colostrinin peptide fragments which are useful, 

CC inter alia, in the treatment of chronic disorders of the immune system 

CC and the central nervous system. Colostrinin peptides are used as a 

CC medicament in the treatment of neurological disorders e.g., dementia, 

CC neurodegenerative disorders e.g., Alzheimer's disease, motor neuron

CC disease e.g., Parkinson's disease, mental disorders e.g. psychosis and 

CC neurosis, in acquired immunological deficiencies, chronic bacterial and 

CC viral infections and diseases characterised by the presence of beta- 

CC amyloid plaques and as a dietary supplement for babies, small children, 

CC adults and senile persons, who have been subjected to chemotherapy or 

CC have suffered from cachexia or weight loss due to the chronic disease. 

CC Colostrinin peptides are also used as food additives and as an auxiliary 

CC withdrawal treatment for drug addicts, after a period of detoxification 

CC and in persons dependent on stimulants. Colostrinin peptides are used to 

CC prepare antibodies and to treat emotional disturbances, e.g. emotional 

CC disturbances of psychiatric patients in a state of depression. These 

CC colostrinin peptides improves the development of immune system in a new 

CC born child and to correct the immunological deficiencies in a child. The 

CC present sequence is modified colostrinin cyclic peptide #3 related to the 

CC invention 
XX 

SQ Sequence 10 AA; 



Query Match 63.0%; Score 51; DB 4; Length 10; 

Best Local Similarity 100.0%; Pred. No. 0.26; 

Matches 9; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy 5 PVLPVEPFP 13
     |||||||||
Db 2 PVLPVEPFP 10



RESULT 11 
ADB64957 

ID ADB64957 standard; protein; 232 AA. 
XX 

AC ADB64957; 
XX 

DT 04-DEC-2003 (first entry) 
XX 

DE Human protein encoded by clone PROST20054660.
XX 

KW Human; pharmaceutical; diagnostic; gene therapy; tissue regeneration; 

KW cell regeneration; membrane protein; signal transduction-related protein; 

KW transcription-related protein; osteoporosis; neurological disease; 



KW cancer; tumour. 
XX 

OS Homo sapiens. 
XX 

PN EP1308459-A2. 
XX 

PD 07-MAY-2003. 
XX 

PF 28-MAR-2002; 2002EP-00007401 . 
XX 

PR 05-NOV-2001; 2001JP-00379298.

PR 25-JAN-2002; 2002US-00350978.
XX 

PA (HELI-) HELIX RES INST. 

PA (REAS-) RES ASSOC BIOTECHNOLOGY. 

XX 

PI Isogai T, Sugiyama T, Otsuki T, Wakamatsu A, Sato H, Ishii S; 

PI Yamamoto J, Isono Y, Hio Y, Otsuka Nagai K, Irie R, Tamechika I; 

PI Seki N, Yoshikawa T, Otsuka M, Nagahari K, Masuho Y; 

XX 

DR WPI; 2003-450961/43. 

DR N-PSDB; ADB62987. 
XX 

PT New polynucleotides and polypeptides, useful for developing a diagnostic 

PT marker or medicines for regulation of their expression and activity, or 

PT as targets of gene therapy. 
XX 

PS Claim 1; Page; 222pp; English. 
XX 

CC The invention discloses a polynucleotide comprising a sequence selected

CC from 1970 fully defined nucleotide sequences which encode novel 

CC polypeptides. Also claimed is a polypeptide encoded by the polynucleotide 

CC or its partial peptide, an antibody binding to the polypeptide or peptide 

CC of the polynucleotide, immunologically assaying the polypeptide or 

CC peptide of the polynucleotide by contacting the polypeptide or peptide 

CC with the antibody of the encoded protein, and observing the binding 

CC between the two, a transformant carrying the polynucleotide in an

CC expressible manner and an antisense polynucleotide. The oligonucleotide 

CC is useful as a primer for synthesising the polynucleotide, or as a probe 

CC for detecting the polynucleotide. The polynucleotides and encoded 

CC proteins are useful as pharmaceutical agents and many disease-related 

CC genes may be included in them, for developing a diagnostic marker or 

CC medicines for regulation of their expression and activity, or as targets 

CC of gene therapy. The genes are involved in tissue and/or cell 

CC regeneration. Membrane proteins, signal transduction-related proteins, 

CC transcription-related proteins, disease-related proteins and genes 

CC encoding them can be used as indicators for diseases (e.g. osteoporosis, 

CC neurological diseases, cancer, tumours). The cDNA may be used to regulate

CC the activity or expression of the encoded protein to treat diseases. The 

CC sequence presented is a protein of the invention. Note: Some of the 

CC sequence data for this patent is not represented in the printed 

CC specification, but is based on sequence information supplied by the 

CC European Patent Office. 

XX 

SQ Sequence 232 AA; 



Query Match 59.3%; Score 48; DB 7; Length 232;
Best Local Similarity 66.7%; Pred. No. 21;
Matches 8; Conservative 2; Mismatches 2; Indels 0; Gaps 0;

Qy   3 EMPVLPVEPFPF 14
       : ||||| |:||
Db 209 KFPVLPVHPWPF 220
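
The summary figures for this hit can be reproduced from the two aligned segments. The sketch below scores an ungapped alignment with BLOSUM62 (the scoring table named in the run header), builds the match line, and derives the counts and percentages. Two points are assumptions for illustration: a non-identical pair is treated as "conservative" when its BLOSUM62 score is positive, and the function name `summarize` is hypothetical. Only the BLOSUM62 entries needed for this alignment are included.

```python
# Partial BLOSUM62 table: only the residue pairs that occur in this alignment.
BLOSUM62 = {
    ("E", "K"): 1, ("M", "F"): 0, ("E", "H"): 0, ("F", "W"): 1,
    ("P", "P"): 7, ("V", "V"): 4, ("L", "L"): 4, ("F", "F"): 6,
}


def sub_score(a, b):
    """Look up a substitution score; the table is symmetric."""
    if (a, b) in BLOSUM62:
        return BLOSUM62[(a, b)]
    return BLOSUM62[(b, a)]


def summarize(query, subject, perfect_score):
    """Summarize an ungapped alignment of two equal-length segments."""
    if len(query) != len(subject):
        raise ValueError("segments must have equal length (no gaps handled)")
    line, score, matches, conservative = [], 0, 0, 0
    for q, s in zip(query, subject):
        value = sub_score(q, s)
        score += value
        if q == s:
            matches += 1
            line.append("|")
        elif value > 0:  # assumption: positive score counts as conservative
            conservative += 1
            line.append(":")
        else:
            line.append(" ")
    length = len(query)
    return {
        "match_line": "".join(line),
        "score": score,
        "matches": matches,
        "conservative": conservative,
        "mismatches": length - matches - conservative,
        "best_local_similarity": round(100 * matches / length, 1),
        "query_match": round(100 * score / perfect_score, 1),
    }


summary = summarize("EMPVLPVEPFPF", "KFPVLPVHPWPF", perfect_score=81)
for key, value in summary.items():
    print(f"{key}: {value!r}")
```

Under these assumptions the sketch gives a score of 48, 8 matches, 2 conservative substitutions and 2 mismatches, which agrees with the figures reported for this hit.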



RESULT 12 
ID AAU????? standard; protein; 55 AA.

PR 02-JUN-2000; 2000US-0208841P.

DR WPI; 2001-??????/71.

DR N-PSDB; AAS59609.


XX

PT Propionibacterium acnes polypeptides and nucleic acids useful for

PT vaccinating against and diagnosing infections, especially useful for

PT treating acne vulgaris.
XX

PS Example 1; SEQ ID NO 21581; 1069pp; English.
XX

CC Sequences AAU39105-AAU68017 represent Propionibacterium acnes immunogenic

CC polypeptides. The proteins and their associated DNA sequences are used in

CC the treatment, prevention and diagnosis of medical conditions caused by

CC P. acnes. The disorders include SAPHO syndrome (synovitis, acne,

CC pustulosis, hyperostosis and osteomyelitis), uveitis and endophthalmitis.

CC P. acnes is also involved in infections of bone, joints and the central

CC nervous system, however it is particularly involved in the inflammatory

CC lesions associated with acne vulgaris. A method for detecting the

CC presence or absence of P. acnes in a patient comprises contacting a

CC sample with a binding agent that binds to the proteins of the invention

CC and determining the amount of bound protein in the sample. The 

CC polypeptides may be used as antigens in the production of antibodies 

CC specific for P. acnes proteins. These antibodies can be used to 

CC downregulate expression and activity of P. acnes polypeptides and 

CC therefore treat P. acnes infections. The antibodies may also be used as 

CC diagnostic agents for determining P. acnes presence, for example, by 

CC enzyme linked immunosorbent assay (ELISA). Note: The sequence data for

CC this patent did not form part of the printed specification, but was 

CC obtained in electronic format directly from WIPO at 

CC ftp.wipo.int/pub/published_pct_sequences

XX 

SQ Sequence 55 AA; 

Query Match 55.6%; Score 45; DB 4; Length 55;
Best Local Similarity 63.6%; Pred. No. 13;
Matches 7; Conservative 2; Mismatches 2; Indels 0; Gaps 0;

Qy  4 MPVLPVEPFPF 14
      :||||  |||:
Db 23 LPVLPQSPFPY 33



RESULT 13
ABM56905

ID ABM56905 standard; protein; 55 AA.
XX

AC ABM56905;
XX

DT 20-OCT-2003 (first entry)
XX

DE Propionibacterium acnes predicted ORF-encoded polypeptide #21581.
XX

KW Acne vulgaris; antiseborrhoeic; dermatological; antibacterial;

KW immunostimulant; immune response; vaccine.
XX

OS Propionibacterium acnes.
XX

PN WO2003033515-A1.
XX

PD 24-APR-2003.
XX

PF 11-OCT-2002; 2002WO-US032727.
XX

PR 15-OCT-2001; 2001US-00978825.
XX

PA (CORI-) CORIXA CORP.
XX

PI Mitcham JL, Skeiky YAW, Persing DH, Bhatia A, Maisonneuve JL;

PI Zhang Y, Wang S, Jen S, Lodes MJ, Benson DR, Jones R, Carter D;

PI Barth B, Vallieve-Douglass J;
XX

DR WPI; 2003-381789/36.

DR N-PSDB; ACF64538.
XX

PT New Propionibacterium acnes polypeptides and polynucleotides encoding the



PT polypeptide, useful for diagnosing, preventing or treating acne vulgaris, 

PT or for stimulating an immune response specific for a P. acnes protein. 
XX 

PS Example 1; SEQ ID NO 21581; 1481pp; English. 
XX 

CC The invention relates to an isolated polynucleotide (ACF64435-ACF64733)

CC encoding a Propionibacterium acnes protein. The invention also relates to 

CC polypeptides encoded by the polynucleotides (ABM35624-ABM64536) and to 

CC immunogenic fragments of P. acnes polypeptides. The invention 

CC additionally encompasses expression vectors and host cells comprising a 

CC polynucleotide of the invention; antibodies against polypeptides of the 

CC invention; fusion proteins comprising a polypeptide of the invention; a 

CC method for stimulating an immune response specific for a P. acnes 

CC polypeptide and an isolated T cell population comprising T cells prepared 

CC via this method; a vaccine composition (comprising P. acnes polypeptides, 

CC polynucleotides, antibodies, fusion proteins, T cell populations, or 

CC antigen-presenting cells that express the polypeptide); a method and kit

CC for detecting or determining the presence or absence of P. acnes in a 

CC patient; and a method for inhibiting the development of P. acnes in a 

CC patient. The P. acnes polypeptides, polynucleotides, antibodies, fusion 

CC proteins, T cell populations or antigen-presenting cells that express the 

CC polypeptides are useful for diagnosing, preventing or treating acne 

CC vulgaris, or for stimulating an immune response specific for a P. acnes 

CC protein. The polynucleotides can also be used as probes or primers for 

CC nucleic acid hybridisation. The vaccine composition is useful for the 

CC stimulation of an immune response against P. acnes, or for treating acne, 

CC and the kit is useful for performing a diagnostic assay. The present 

CC sequence represents a polypeptide predicted to be encoded by an ORF (open

CC reading frame) contained within the P. acnes polynucleotides of the 

CC invention. Note: The sequence data for this patent did not form part of 

CC the printed specification, but was obtained in electronic format directly 

CC from WIPO at ftp.wipo.int/pub/published_pct_sequences 

XX 

SQ Sequence 55 AA; 



Query Match 55.6%; Score 45; DB 6; Length 55; 

Best Local Similarity 63.6%; Pred. No. 13; 

Matches 7; Conservative 2; Mismatches 2; Indels 0; Gaps 0; 

Qy  4 MPVLPVEPFPF 14
      :||||  |||:
Db 23 LPVLPQSPFPY 33



RESULT 14 
AAG11334 

ID AAG11334 standard; protein; 89 AA. 
XX 

AC AAG11334;
XX 

DT 17-OCT-2000 (first entry) 
XX 

DE Arabidopsis thaliana protein fragment SEQ ID NO: 10012. 
XX 

KW Protein identification; signal transduction pathway; metabolic pathway; 

KW hybridisation assay; genetic mapping; gene expression control; promoter; 

KW termination sequence. 



OS Arabidopsis thaliana. 
XX 

PN EP1033405-A2. 
XX 
PR 09-???-1999; 99US-014????P.
PR 03-AUG-1999; 99US-0147???P.
PR 04-???-1999; 99US-0147904P.
PR 04-AUG-1999; 99US-0147?09P.
PR 05-AUG-1999; 99US-0147192P.
PR 05-AUG-1999; 99US-0147960P.
PR 06-AUG-1999; 99US-0147???P.
PR 06-AUG-1999; 99US-0147416P.
PR 09-AUG-1999; 99US-0147493P.
PR 09-AUG-1999; 99US-0147935P.
PR 10-AUG-1999; 99US-0148171P.
PR ??-???-1999; 99US-014????P.
PR 12-AUG-1999; 99US-0148341P.
PR 13-AUG-1999; 99US-0148565P.
PR 13-AUG-1999; 99US-0148684P.
PR 16-AUG-1999; 99US-0149368P.
PR 17-AUG-1999; 99US-0149175P.



PR 18-AUG-1999; 99US-0149426P.
PR 20-AUG-1999; 99US-0149722P.
PR 20-AUG-1999; 99US-0149723P.
PR 20-AUG-1999; 99US-0149929P.
PR 23-AUG-1999; 99US-0149902P.
PR 23-AUG-1999; 99US-0149930P.
PR 25-AUG-1999; 99US-0150566P.
PR 26-AUG-1999; 99US-0150884P.
PR 27-AUG-1999; 99US-0151065P.
PR 27-AUG-1999; 99US-0151066P.
PR 27-AUG-1999; 99US-0151080P.
PR 30-AUG-1999; 99US-0151303P.
PR 31-AUG-1999; 99US-0151438P.
PR 01-SEP-1999; 99US-0151930P.
PR 07-SEP-1999; 99US-0152363P.
PR 10-SEP-1999; 99US-0153070P.
PR 13-SEP-1999; 99US-0153758P.
PR 15-SEP-1999; 99US-0154018P.
PR 16-SEP-1999; 99US-0154039P.
PR 20-SEP-1999; 99US-0154779P.
PR 22-SEP-1999; 99US-0155139P.
PR 23-SEP-1999; 99US-0155486P.
PR 24-SEP-1999; 99US-0155659P.
PR 28-SEP-1999; 99US-0156458P.
PR 29-SEP-1999; 99US-0156596P.
PR 04-OCT-1999; 99US-0157117P.
PR 05-OCT-1999; 99US-0157753P.
PR 06-OCT-1999; 99US-0157865P.
PR 07-OCT-1999; 99US-0158029P.
PR 08-OCT-1999; 99US-0158232P.
PR 12-OCT-1999; 99US-0158369P.
PR 13-OCT-1999; 99US-0159293P.
PR 13-OCT-1999; 99US-0159294P.
PR 13-OCT-1999; 99US-0159295P.
PR 14-OCT-1999; 99US-0159329P.
PR 14-OCT-1999; 99US-0159330P.
PR 14-OCT-1999; 99US-0159331P.
PR 14-OCT-1999; 99US-0159637P.
PR 14-OCT-1999; 99US-0159638P.
PR 18-OCT-1999; 99US-0159584P.
PR 21-OCT-1999; 99US-0160741P.
PR 21-OCT-1999; 99US-0160767P.
PR 21-OCT-1999; 99US-0160768P.
PR 21-OCT-1999; 99US-0160770P.
PR 21-OCT-1999; 99US-0160814P.
PR 21-OCT-1999; 99US-0160815P.
PR 22-OCT-1999; 99US-0160980P.
PR 22-OCT-1999; 99US-0160981P.
PR 22-OCT-1999; 99US-0160989P.
PR 25-OCT-1999; 99US-0161404P.
PR 25-OCT-1999; 99US-0161405P.
PR 25-OCT-1999; 99US-0161406P.
PR 26-OCT-1999; 99US-0161359P.
PR 26-OCT-1999; 99US-0161360P.
PR 26-OCT-1999; 99US-0161361P.
PR 28-OCT-1999; 99US-0161920P.
PR 28-OCT-1999; 99US-0161992P.
PR 28-OCT-1999; 99US-0161993P.
PR 29-OCT-1999; 99US-0162142P.

Query Match 55.6%; Score 45; DB 3; Length 91; 

Best Local Similarity 77.8%; Pred. No. 22; 

Matches 7; Conservative 1; Mismatches 1; Indels 0; Gaps 0; 

Qy          5 PVLPVEPFP 13
              ||:| ||||
Db         49 PVIPTEPFP 57
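
The statistics printed with this alignment can be checked by hand. The sketch below is an illustration only, not the search program's code: it assumes that `|` marks an identical pair, `:` a non-identical pair with a positive substitution score, and a blank any other pair. `PAIR_SCORES` is a hypothetical helper table holding just the BLOSUM62 entries this alignment needs.

```python
# Subset of BLOSUM62 pair scores, limited to the pairs in this alignment.
PAIR_SCORES = {
    ("P", "P"): 7, ("V", "V"): 4, ("E", "E"): 5, ("F", "F"): 6,
    ("L", "I"): 2, ("V", "T"): 0,
}


def pair_score(a, b):
    """Look up a symmetric substitution score."""
    if (a, b) in PAIR_SCORES:
        return PAIR_SCORES[(a, b)]
    return PAIR_SCORES[(b, a)]


def describe_alignment(query, subject):
    """Return (match line, score, matches, conservative, mismatches)
    for an ungapped alignment of two equal-length strings."""
    marks = []
    score = matches = conservative = mismatches = 0
    for a, b in zip(query, subject):
        s = pair_score(a, b)
        score += s
        if a == b:
            marks.append("|")
            matches += 1
        elif s > 0:          # assumed meaning of ':' in the report
            marks.append(":")
            conservative += 1
        else:
            marks.append(" ")
            mismatches += 1
    return "".join(marks), score, matches, conservative, mismatches


line, score, matches, conservative, mismatches = describe_alignment(
    "PVLPVEPFP", "PVIPTEPFP")
similarity = round(100.0 * matches / len(line), 1)
print(repr(line), score, matches, conservative, mismatches, similarity)
```

Under these assumptions the sketch reproduces the reported Score 45, Matches 7, Conservative 1, Mismatches 1 and Best Local Similarity 77.8%.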



Search completed: August 24, 2004, 15:42:28 
Job time : 65.1194 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 



Run on: August 24, 2004, 15:33:13 ; Search time 16.4552 Seconds 

(without alignments)
47.060 Million cell updates/sec

Title: US-09-641-801-5

Perfect score: 81 

Sequence: 1 DLEMPVLPVEPFPFV 15 



Scoring table: BLOSUM62 

Gapop 10.0 , Gapext 0.5 

Searched: 389414 seqs, 51625971 residues 

Total number of hits satisfying chosen parameters: 389414 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 



Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 
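
The "Perfect score" and "Query Match" figures in this report are consistent with simple arithmetic on the scoring table. The sketch below is a hedged illustration rather than the program's own method: it assumes the perfect score is the query aligned to itself under BLOSUM62, and that Query Match is the hit score as a percentage of that perfect score. `BLOSUM62_SELF` is a hypothetical table holding only the diagonal entries for residues that occur in the query.

```python
# BLOSUM62 diagonal (self-match) scores for the residues in the query only.
BLOSUM62_SELF = {"D": 6, "E": 5, "F": 6, "L": 4, "M": 5, "P": 7, "V": 4}


def perfect_score(sequence):
    """Score of a sequence aligned to itself, with no gaps."""
    return sum(BLOSUM62_SELF[residue] for residue in sequence)


def query_match_percent(score, perfect):
    """Hit score as a percentage of the perfect score, to one decimal."""
    return round(100.0 * score / perfect, 1)


QUERY = "DLEMPVLPVEPFPFV"
best = perfect_score(QUERY)
print(best)                             # 81
print(query_match_percent(45, best))    # 55.6
print(query_match_percent(41.5, best))  # 51.2
```

These values agree with the Perfect score of 81 and with the Query Match percentages listed for hits scoring 45 and 41.5.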



Database : Issued_Patents_AA:*

1: /cgn2_6/ptodata/2/iaa/5A_COMB.pep:*

2: /cgn2_6/ptodata/2/iaa/5B_COMB.pep:*

3: /cgn2_6/ptodata/2/iaa/6A_COMB.pep:*

4: /cgn2_6/ptodata/2/iaa/6B_COMB.pep:*

5: /cgn2_6/ptodata/2/iaa/PCTUS_COMB.pep:*

6: /cgn2_6/ptodata/2/iaa/backfiles1.pep:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



SUMMARIES



Result          Query
   No.  Score   Match  Length  DB  ID                     Description

     1     81   100.0      15   4  US-09-641-803-5        Sequence 5, Appli
     2     45    55.6     803   4  US-09-252-991A-30479   Sequence 30479, A
     3     44    54.3     259   4  US-09-328-352-4873     Sequence 4873, Ap
     4     42    51.9     242   4  US-09-252-991A-28361   Sequence 28361, A
     5   41.5    51.2      86   4  US-09-461-325-456      Sequence 456, App
     6   41.5    51.2      86   4  US-10-012-542-456      Sequence 456, App
     7     40    49.4     143   4  US-09-621-976-5226     Sequence 5226, Ap
     8     40    49.4     219   4  US-09-527-345-2        Sequence 2, Appli
     9     40    49.4     577   4  US-09-489-039A-9575    Sequence 9575, Ap
    10     39    48.1     220   4  US-09-198-452A-211     Sequence 211, App
    11     39    48.1     373   4  US-09-149-476-374      Sequence 374, App
    12     39    48.1     405   3  US-08-888-429A-22      Sequence 22, Appl
    13     39    48.1     405   4  US-09-593-653-22       Sequence 22, Appl
    14     39    48.1     444   4  US-09-252-991A-20775   Sequence 20775, A
    15     39    48.1     526   1  US-07-921-796-6        Sequence 6, Appli
    16     39    48.1     526   1  US-07-921-796-8        Sequence 8, Appli
    17     39    48.1     540   4  US-08-945-771-2        Sequence 2, Appli
    18     38    46.9     110   4  US-09-543-681A-5498    Sequence 5498, Ap
    19     38    46.9     455   4  US-09-489-039A-9964    Sequence 9964, Ap
    20     38    46.9     477   3  US-08-704-711A-20      Sequence 20, Appl
    21     38    46.9     477   3  US-08-448-489-15       Sequence 15, Appl
    22     38    46.9     477   3  US-08-281-313-1        Sequence 9, Appli
    23     38    46.9     477   4  US-09-521-220-20       Sequence 20, Appl
    24     38    46.9     477   4  US-09-391-104-21       Sequence 21, Appl
    25     38    46.9     486   4  US-08-259-451-13       Sequence 13, Appl
    26     38    46.9     548   4  US-09-252-991A-28958   Sequence 28958, A
    27     38    46.9     655   4  US-09-252-991A-25314   Sequence 25314, A
    28     38    46.9     820   4  US-09-252-991A-32001   Sequence 32001, A
    29     38    46.9     937   4  US-09-252-991A-19108   Sequence 19108, A
    30     38    46.9     959   4  US-09-543-681A-6879    Sequence 6879, Ap
    31     37    45.7     367   1  US-07-864-004B-2       Sequence 2, Appli
    32     37    45.7     367   1  US-08-251-937A-2       Sequence 2, Appli
    33     37    45.7     367   4  US-09-198-452A-1069    Sequence 1069, Ap
    34     37    45.7     367   5  PCT-US93-03275-2       Sequence 2, Appli
    35     37    45.7     368   1  US-08-212-133A-6       Sequence 6, Appli
    36     37    45.7     368   1  US-08-474-503-4        Sequence 4, Appli
    37     37    45.7     368   2  US-08-670-707A-4       Sequence 4, Appli
    38     37    45.7     368   3  US-09-037-601-4        Sequence 4, Appli
    39     37    45.7     368   4  US-09-315-179-4        Sequence 4, Appli
    40     37    45.7     368   4  US-09-523-656-4        Sequence 4, Appli
    41     37    45.7     368   5  PCT-US94-13200-4       Sequence 4, Appli
    42     37    45.7     372   4  US-09-252-991A-22962   Sequence 22962, A
    43     37    45.7     429   4  US-09-328-352-4875     Sequence 4875, Ap
    44     37    45.7     482   4  US-09-489-039A-7931    Sequence 7931, Ap
    45     37    45.7     486   1  US-07-672-483-2        Sequence 2, Appli



ALIGNMENTS 



RESULT 1 
US-09-641-803-5 

; Sequence 5, Application US/09641803 

; Patent No. 6500798 

; GENERAL INFORMATION: 

; APPLICANT: STANTON, G. John 

; APPLICANT: HUGHES, Thomas K. 

; APPLICANT: BOLDOGH, Istvan 

; TITLE OF INVENTION: USE OF COLOSTRININ, CONSTITUENT PEPTIDES THEREOF, AND 

; TITLE OF INVENTION: ANALOGS THEREOF AS OXIDATIVE STRESS REGULATORS 

; FILE REFERENCE: 265.00220101 

; CURRENT APPLICATION NUMBER: US/09/641,803
; CURRENT FILING DATE: 2000-08-17
; PRIOR APPLICATION NUMBER: 60/149,310
; PRIOR FILING DATE: 1999-08-17
; NUMBER OF SEQ ID NOS: 34
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 5
; LENGTH: 15
; TYPE: PRT
; ORGANISM: Artificial Sequence
; FEATURE:
; OTHER INFORMATION: Description of Artificial Sequence: synthetic
; OTHER INFORMATION: peptide
US-09-641-803-5

Query Match 100.0%; Score 81; DB 4; Length 15;
Best Local Similarity 100.0%; Pred. No. 2.8e-06;
Matches 15; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy          1 DLEMPVLPVEPFPFV 15
              |||||||||||||||
Db          1 DLEMPVLPVEPFPFV 15



RESULT 2 

US-09-252-991A-30479 

; Sequence 30479, Application US/09252991A 
; Patent No. 6551795 
; GENERAL INFORMATION: 

; APPLICANT: Marc J. Rubenfield et al.
; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO PSEUDOMONAS
; TITLE OF INVENTION: AERUGINOSA FOR DIAGNOSTICS AND THERAPEUTICS
; FILE REFERENCE: 107196.136
; CURRENT APPLICATION NUMBER: US/09/252,991A
; CURRENT FILING DATE: 1999-02-18
; PRIOR APPLICATION NUMBER: US 60/074,788
; PRIOR FILING DATE: 1998-02-18
; PRIOR APPLICATION NUMBER: US 60/094,190
; PRIOR FILING DATE: 1998-07-27
; NUMBER OF SEQ ID NOS: 33142
; SEQ ID NO 30479
; LENGTH: 803
; TYPE: PRT
; ORGANISM: Pseudomonas aeruginosa
US-09-252-991A-30479

Query Match 55.6%; Score 45; DB 4; Length 803;
Best Local Similarity 53.3%; Pred. No. 50;
Matches 8; Conservative 3; Mismatches 4; Indels 0; Gaps 0;

Qy          1 DLEMPVLPVEPFPFV 15
              | | ||:||:  |:|
Db        318 DAEPPVVPVQVLPYV 332



RESULT 3 

US-09-328-352-4873 

; Sequence 4873, Application US/09328352 

; Patent No. 6562958 

; GENERAL INFORMATION: 

; APPLICANT: Gary L. Breton et al. 

; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO ACINETOBACTER
; TITLE OF INVENTION: BAUMANNII FOR DIAGNOSTICS AND THERAPEUTICS
; FILE REFERENCE: GTC99-03PA
; CURRENT APPLICATION NUMBER: US/09/328,352
; CURRENT FILING DATE: 1999-06-04
; NUMBER OF SEQ ID NOS: 8252
; SEQ ID NO 4873
; LENGTH: 259
; TYPE: PRT
; ORGANISM: Acinetobacter baumannii
US-09-328-352-4873

Query Match 54.3%; Score 44; DB 4; Length 259;
Best Local Similarity 58.3%; Pred. No. 21;
Matches 7; Conservative 3; Mismatches 2; Indels 0; Gaps 0;

Qy          4 MPVLPVEPFPFV 15
              | | ||:|:||:
Db        166 MVVRPVDPYPFI 177



RESULT 4 

US-09-252-991A-28361 

; Sequence 28361, Application US/09252991A 
; Patent No. 6551795 
; GENERAL INFORMATION: 

; APPLICANT: Marc J. Rubenfield et al.
; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO PSEUDOMONAS
; TITLE OF INVENTION: AERUGINOSA FOR DIAGNOSTICS AND THERAPEUTICS
; FILE REFERENCE: 107196.136
; CURRENT APPLICATION NUMBER: US/09/252,991A
; CURRENT FILING DATE: 1999-02-18
; PRIOR APPLICATION NUMBER: US 60/074,788
; PRIOR FILING DATE: 1998-02-18
; PRIOR APPLICATION NUMBER: US 60/094,190
; PRIOR FILING DATE: 1998-07-27
; NUMBER OF SEQ ID NOS: 33142
; SEQ ID NO 28361
; LENGTH: 242
; TYPE: PRT
; ORGANISM: Pseudomonas aeruginosa
US-09-252-991A-28361

Query Match 51.9%; Score 42; DB 4; Length 242;
Best Local Similarity 58.3%; Pred. No. 40;
Matches 7; Conservative 1; Mismatches 4; Indels 0; Gaps 0;

Qy          3 EMPVLPVEPFPF 14
              | |  | :||||
Db         60 ESPQRPAQPFPF 71



RESULT 5 

US-09-461-325-456 

; Sequence 456, Application US/09461325A 
; Patent No. 6475753 
; GENERAL INFORMATION: 



; APPLICANT: Ruben et al.
; TITLE OF INVENTION: 94 Human Secreted Proteins
; FILE REFERENCE: PZ029P1
; CURRENT APPLICATION NUMBER: US/09/461,325A
; CURRENT FILING DATE: 1999-12-14
; EARLIER APPLICATION NUMBER: PCT/US99/13418
; EARLIER FILING DATE: 1999-06-15
; EARLIER APPLICATION NUMBER: 60/089,507
; EARLIER FILING DATE: 1998-06-16
; EARLIER APPLICATION NUMBER: 60/089,508
; EARLIER FILING DATE: 1998-06-16
; EARLIER APPLICATION NUMBER: 60/089,509
; EARLIER FILING DATE: 1998-06-16
; EARLIER APPLICATION NUMBER: 60/089,510
; EARLIER FILING DATE: 1998-06-16
; EARLIER APPLICATION NUMBER: 60/090,112
; EARLIER FILING DATE: 1998-06-22
; EARLIER APPLICATION NUMBER: 60/090,113
; EARLIER FILING DATE: 1998-06-22
; NUMBER OF SEQ ID NOS: 532
; SOFTWARE: PatentIn Ver. 2.0
; SEQ ID NO 456
; LENGTH: 86
; TYPE: PRT
; ORGANISM: Homo sapiens
US-09-461-325-456

Query Match 51.2%; Score 41.5; DB 4; Length 86;
Best Local Similarity 47.1%; Pred. No. 16;
Matches 8; Conservative 3; Mismatches 1; Indels 5; Gaps 1;

Qy          2 LEMPVLP-----VEPFP 13
              ||:|:||     : |||
Db         17 LEVPILPTHHLLIHPFP 33
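
The half-point score of this hit follows from the gap parameters in the search header (Gapop 10.0, Gapext 0.5). The sketch below is an illustration under stated assumptions, not the program's implementation: it assumes a gap of n residues costs Gapop + n x Gapext, and that the five-residue gap in the query falls after residue 8 (LP), as the printed alignment suggests. `PAIR_SCORES` is a hypothetical table with only the BLOSUM62 entries this alignment needs.

```python
GAP_OPEN = 10.0    # "Gapop" in the search header
GAP_EXTEND = 0.5   # "Gapext" in the search header

# Subset of BLOSUM62 pair scores, limited to the pairs in this alignment.
PAIR_SCORES = {
    ("L", "L"): 4, ("E", "E"): 5, ("P", "P"): 7, ("F", "F"): 6,
    ("M", "V"): 1, ("V", "I"): 3, ("E", "H"): 0,
}


def pair_score(a, b):
    """Look up a symmetric substitution score."""
    if (a, b) in PAIR_SCORES:
        return PAIR_SCORES[(a, b)]
    return PAIR_SCORES[(b, a)]


def gapped_score(query, subject):
    """Score a gapped alignment; '-' marks a gap position.
    Assumes each gap costs GAP_OPEN once plus GAP_EXTEND per position."""
    score = 0.0
    in_gap = False
    for a, b in zip(query, subject):
        if a == "-" or b == "-":
            if not in_gap:
                score -= GAP_OPEN
            score -= GAP_EXTEND
            in_gap = True
        else:
            score += pair_score(a, b)
            in_gap = False
    return score


print(gapped_score("LEMPVLP-----VEPFP", "LEVPILPTHHLLIHPFP"))  # 41.5
```

The aligned pairs sum to 54 and the gap costs 10.0 + 5 x 0.5 = 12.5, which gives the reported Score of 41.5.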



RESULT 6 

US-10-012-542-456 

; Sequence 456, Application US/10012542 

; Patent No. 6627741 

; GENERAL INFORMATION: 

; APPLICANT: Ruben et al. 

; TITLE OF INVENTION: 94 Human Secreted Proteins 
; FILE REFERENCE: PZ029P1 

; CURRENT APPLICATION NUMBER: US/10/012,542
; CURRENT FILING DATE: 2001-12-12
; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 09/461,325
; PRIOR FILING DATE: EARLIER FILING DATE: 1999-12-14
; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/089,507
; PRIOR FILING DATE: EARLIER FILING DATE: 1998-06-16
; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/089,508
; PRIOR FILING DATE: EARLIER FILING DATE: 1998-06-16
; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/089,509
; PRIOR FILING DATE: EARLIER FILING DATE: 1998-06-16
; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/089,510
; PRIOR FILING DATE: EARLIER FILING DATE: 1998-06-16
; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/090,112
; PRIOR FILING DATE: EARLIER FILING DATE: 1998-06-22
; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/090,113
; PRIOR FILING DATE: EARLIER FILING DATE: 1998-06-22
; NUMBER OF SEQ ID NOS: 532
; SOFTWARE: PatentIn Ver. 2.0
; SEQ ID NO 456
; LENGTH: 86
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-012-542-456

Query Match 51.2%; Score 41.5; DB 4; Length 86;
Best Local Similarity 47.1%; Pred. No. 16;
Matches 8; Conservative 3; Mismatches 1; Indels 5; Gaps 1;

Qy          2 LEMPVLP-----VEPFP 13
              ||:|:||     : |||
Db         17 LEVPILPTHHLLIHPFP 33



RESULT 7 

US-09-621-976-5226 

; Sequence 5226, Application US/09621976 
; Patent No. 6639063 
; GENERAL INFORMATION: 

; APPLICANT: Dumas Milne Edwards, J.B.
; APPLICANT: Jobert, S.
; APPLICANT: Giordano, J.Y.
; TITLE OF INVENTION: ESTs and Encoded Human Proteins.
; FILE REFERENCE: GENSET.054PR2
; CURRENT APPLICATION NUMBER: US/09/621,976
; CURRENT FILING DATE: 2000-07-21
; NUMBER OF SEQ ID NOS: 19335
; SOFTWARE: Patent.pm
; SEQ ID NO 5226
; LENGTH: 143
; TYPE: PRT
; ORGANISM: Homo sapiens
; FEATURE:
; NAME/KEY: SIGNAL
; LOCATION: -15..-1
; NAME/KEY: UNSURE
; LOCATION: 111
; OTHER INFORMATION: Xaa = Ala, Pro
US-09-621-976-5226

Query Match 49.4%; Score 40; DB 4; Length 143;
Best Local Similarity 57.1%; Pred. No. 46;
Matches 8; Conservative 1; Mismatches 5; Indels 0; Gaps 0;

Qy          2 LEMPVLPVEPFPFV 15
              | :| ||   ||||
Db         99 LNVPPLPPRGFPFV 112



RESULT 8 
US-09-527-345-2 



; Sequence 2, Application US/09527345 

; Patent No. 6331413 

; GENERAL INFORMATION: 

; APPLICANT: Sheppard, Paul O. 

; APPLICANT: Adler, David A. 

; TITLE OF INVENTION: SECRETED SALIVARY ZSIG63 POLYPEPTIDE 
; FILE REFERENCE: 97-71 

; CURRENT APPLICATION NUMBER: US/09/527,345
; CURRENT FILING DATE: 1999-03-17
; PRIOR APPLICATION NUMBER: US 60/124,820
; PRIOR FILING DATE: 1999-03-17
; NUMBER OF SEQ ID NOS: 9
; SOFTWARE: FastSEQ for Windows Version 3.0
; SEQ ID NO 2
; LENGTH: 219
; TYPE: PRT
; ORGANISM: Homo sapiens
US-09-527-345-2

Query Match 49.4%; Score 40; DB 4; Length 219;
Best Local Similarity 57.1%; Pred. No. 72;
Matches 8; Conservative 1; Mismatches 5; Indels 0; Gaps 0;

Qy          2 LEMPVLPVEPFPFV 15
              | :| ||   ||||
Db         99 LNVPPLPPRGFPFV 112



RESULT 9 

US-09-489-039A-9575 

; Sequence 9575, Application US/09489039A 

; Patent No. 6610836 

; GENERAL INFORMATION: 

; APPLICANT: Gary Breton et al.
; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO KLEBSIELLA
; TITLE OF INVENTION: PNEUMONIAE FOR DIAGNOSTICS AND THERAPEUTICS
; FILE REFERENCE: 2709.2004001
; CURRENT APPLICATION NUMBER: US/09/489,039A
; CURRENT FILING DATE: 2000-01-27
; PRIOR APPLICATION NUMBER: US 60/117,747
; PRIOR FILING DATE: 1999-01-29
; NUMBER OF SEQ ID NOS: 14342
; SEQ ID NO 9575
; LENGTH: 577
; TYPE: PRT
; ORGANISM: Klebsiella pneumoniae
US-09-489-039A-9575

Query Match 49.4%; Score 40; DB 4; Length 577;
Best Local Similarity 66.7%; Pred. No. 2e+02;
Matches 8; Conservative 0; Mismatches 4; Indels 0; Gaps 0;

Qy          2 LEMPVLPVEPFP 13
              |||  ||| | |
Db        302 LEMDTLPVSPAP 313



RESULT 10 

US-09-198-452A-211 

; Sequence 211, Application US/09198452A 

; Patent No. 6559294 

; GENERAL INFORMATION: 

; APPLICANT: Griffais, R. 

; TITLE OF INVENTION: Chlamydia pneumoniae genomic sequence and polypeptides, fragments
; TITLE OF INVENTION: thereof and uses thereof, in particular for the diagnosis, prevention
; TITLE OF INVENTION: and treatment of infection
; FILE REFERENCE: 9710-003-999
; CURRENT APPLICATION NUMBER: US/09/198,452A
; CURRENT FILING DATE: 1998-11-24
; NUMBER OF SEQ ID NOS: 6849
; SEQ ID NO 211
; LENGTH: 220
; TYPE: PRT
; ORGANISM: Chlamydia pneumoniae
US-09-198-452A-211

Query Match 48.1%; Score 39; DB 4; Length 220;
Best Local Similarity 70.0%; Pred. No. 1e+02;
Matches 7; Conservative 1; Mismatches 2; Indels 0; Gaps 0;

Qy          6 VLPVEPFPFV 15
              |||:|| | |
Db         82 VLPIEPTPLV 91



RESULT 11 
US-09-149-476-374 

; Sequence 374, Application US/09149476 

; Patent No. 6420526 

; GENERAL INFORMATION: 

; APPLICANT: Rosen et al. 

; TITLE OF INVENTION: 186 Human Secreted proteins 
; FILE REFERENCE: PZ002P1 

; CURRENT APPLICATION NUMBER: US/09/149,476
; CURRENT FILING DATE: 1998-09-08
; EARLIER APPLICATION NUMBER: PCT/US98/04493
; EARLIER FILING DATE: 1998-03-06 

EARLIER APPLICATION NUMBER: 60/040,162 

EARLIER FILING DATE: 1997-03-07 
; EARLIER APPLICATION NUMBER: 60/040,333 
; EARLIER FILING DATE: 1997-03-07 
; EARLIER APPLICATION NUMBER: 60/038,621 
; EARLIER FILING DATE: 1997-03-07 
; EARLIER APPLICATION NUMBER: 60/040,626 
; EARLIER FILING DATE: 1997-03-07 
; EARLIER APPLICATION NUMBER: 60/040,334 
; EARLIER FILING DATE: 1997-03-07 
; EARLIER APPLICATION NUMBER: 60/040,336 
; EARLIER FILING DATE: 1997-03-07 
; EARLIER APPLICATION NUMBER: 60/040,163 
; EARLIER FILING DATE: 1997-03-07 



EARLIER APPLICATION NUMBER: 60/047,600 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,615 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,597 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,502 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,633 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,583 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,617 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,618 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,503 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,592 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,581 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,584 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,500 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,587 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,492 
EARLIER FILING DATE: 1997-05-23
EARLIER APPLICATION NUMBER: 60/047,598 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,613 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,582 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,596 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,612 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,632
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,601 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/043,580 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,568 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,314 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,569 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,311 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,671 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,674 



EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,669 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,312 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,313 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,672 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,315 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/048,974 
EARLIER FILING DATE: 1997-06-06 
EARLIER APPLICATION NUMBER: 60/056,886 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,877 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,889 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,893 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,630 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,878 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,662 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,872 
EARLIER FILING DATE: 1997-08-22
EARLIER APPLICATION NUMBER: 60/056,882 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,637 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,903 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,888 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,879 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,880 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,894 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,911 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,636 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,874 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,910 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,864
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,631 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,845 
EARLIER FILING DATE: 1997-08-22 



EARLIER APPLICATION NUMBER: 60/056,892 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/057,761 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/047,595 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,599 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,588 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,585 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,586 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,590 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,594 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,589 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,593 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,614 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/043,578
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,576 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/047,501 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/043,670
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/056,632 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,664 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,876 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,881 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,909 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,875 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,862 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,887 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,908 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/048,964 
EARLIER FILING DATE: 1997-06-06 
EARLIER APPLICATION NUMBER: 60/057,650 
EARLIER FILING DATE: 1997-09-05 
EARLIER APPLICATION NUMBER: 60/056,884 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/057,669 



; EARLIER FILING DATE: 1997-09-05 

; EARLIER APPLICATION NUMBER: 60/049,610 

; EARLIER FILING DATE: 1997-06-13 

; EARLIER APPLICATION NUMBER: 60/061,060 

; EARLIER FILING DATE: 1997-10-02 

Query Match 48.1%; Score 39; DB 4; Length 373; 

Best Local Similarity 41.7%; Pred. No. 1.8e+02; 

Matches 5; Conservative 5; Mismatches 2; Indels 0; Gaps 0; 

Qy          1 DLEMPVLPVEPF 12
              ::|:|  |::||
Db        104 EMEVPQAPIQPF 115



RESULT 12 
US-08-888-429A-22 

; Sequence 22, Application US/08888429A 
; Patent No. 6136596 
; GENERAL INFORMATION: 

APPLICANT: Davis, Roger J. 
APPLICANT: Whitmarsh, Alan 
APPLICANT: Tournier, Cathy 
; TITLE OF INVENTION: CYTOKINE-, STRESS-, AND ONCOPROTEIN- 

TITLE OF INVENTION: ACTIVATED HUMAN PROTEIN KINASE KINASES 
NUMBER OF SEQUENCES: 34 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Fish & Richardson P.C. 
STREET: 225 Franklin Street 
; CITY: Boston 

; STATE: MA 

; COUNTRY: USA 

ZIP: 02110-2804 
COMPUTER READABLE FORM: 
MEDIUM TYPE: Diskette 
COMPUTER: IBM Compatible 
OPERATING SYSTEM: Windows95 
SOFTWARE: FastSEQ for Windows Version 2.0 
CURRENT APPLICATION DATA: 
; APPLICATION NUMBER: US/08/888,429A

; FILING DATE: 07-JUL-1997 

PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/530,950 
FILING DATE: 19-SEP-1995 
APPLICATION NUMBER: 08/446,083 
FILING DATE: 19-MAY-1995 
ATTORNEY/AGENT INFORMATION: 
; NAME: Fasse, Peter J. 

REGISTRATION NUMBER: 32,983 
; REFERENCE/DOCKET NUMBER: 07917/053001 

TELECOMMUNICATION INFORMATION: 
; TELEPHONE: 617/542-5070 

; TELEFAX: 617/542-8906 

; TELEX: 299354 

; INFORMATION FOR SEQ ID NO: 22: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 405 amino acids



; TYPE: amino acid 

; TOPOLOGY: linear 

MOLECULE TYPE: protein 
US-08-888-429A-22 

Query Match 48.1%; Score 39; DB 3; Length 405; 

Best Local Similarity 58.8%; Pred. No. 2e+02; 

Matches 10; Conservative 1; Mismatches 4; Indels 2; Gaps 1; 

Qy          1 DLEMPVLPVEPF--PFV 15
              | : |||||  |  |||
Db        328 DEDSPVLPVGEFSEPFV 344



RESULT 13 
US-09-593-653-22 

; Sequence 22, Application US/09593653 
; Patent No. 6610523 

GENERAL INFORMATION: 

APPLICANT: Davis, Roger J. 
; Whitmarsh, Alan 

; Tournier, Cathy 

TITLE OF INVENTION: CYTOKINE-, STRESS-, AND ONCOPROTEIN- 

ACTIVATED HUMAN PROTEIN KINASE KINASES 
; NUMBER OF SEQUENCES: 34

; CORRESPONDENCE ADDRESS: 

; ADDRESSEE: Fish & Richardson P.C. 

STREET: 225 Franklin Street 
; CITY: Boston 

STATE: MA 

COUNTRY: USA 

ZIP: 02110-2804 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 

COMPUTER: IBM Compatible 

OPERATING SYSTEM: Windows95 

SOFTWARE: FastSEQ for Windows Version 2.0 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/09/593,653

FILING DATE: 13-Jun-2000 
; PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US/08/888,429A

FILING DATE: 07-JUL-1997 

APPLICATION NUMBER: 08/530,950 

FILING DATE: 19-SEP-1995 

APPLICATION NUMBER: 08/446,083 

FILING DATE: 19-MAY-1995 
; ATTORNEY/AGENT INFORMATION: 

; NAME: Fasse, Peter J. 

REGISTRATION NUMBER: 32,983 

REFERENCE/DOCKET NUMBER: 07917/053001 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: 617/542-5070 

TELEFAX: 617/542-8906 

TELEX: 299354 
INFORMATION FOR SEQ ID NO: 22: 
SEQUENCE CHARACTERISTICS: 



LENGTH: 405 amino acids
; TYPE: amino acid 

; TOPOLOGY: linear 

MOLECULE TYPE: protein 

SEQUENCE DESCRIPTION: SEQ ID NO: 22: 
US-09-593-653-22 

Query Match 48.1%; Score 39; DB 4; Length 405;

Best Local Similarity 58.8%; Pred. No. 2e+02;

Matches 10; Conservative 1; Mismatches 4; Indels 2; Gaps 1;

Qy          1 DLEMPVLPVEPF--PFV 15
              | : |||||  |  |||
Db        328 DEDSPVLPVGEFSEPFV 344



RESULT 14 

US-09-252-991A-20775

; Sequence 20775, Application US/09252991A 
; Patent No. 6551795 
; GENERAL INFORMATION: 

; APPLICANT: Marc J. Rubenfield et al.
; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO PSEUDOMONAS
; TITLE OF INVENTION: AERUGINOSA FOR DIAGNOSTICS AND THERAPEUTICS
; FILE REFERENCE: 107196.136
; CURRENT APPLICATION NUMBER: US/09/252,991A
; CURRENT FILING DATE: 1999-02-18 
; PRIOR APPLICATION NUMBER: US 60/074,788 
; PRIOR FILING DATE: 1998-02-18 

PRIOR APPLICATION NUMBER: US 60/094,190 
; PRIOR FILING DATE: 1998-07-27 
; NUMBER OF SEQ ID NOS: 33142 
; SEQ ID NO 20775 

LENGTH: 444 

TYPE: PRT 

; ORGANISM: Pseudomonas aeruginosa 
US-09-252-991A-20775 

Query Match 48.1%; Score 39; DB 4; Length 444; 

Best Local Similarity 53.8%; Pred. No. 2.2e+02; 

Matches 7; Conservative 1; Mismatches 5; Indels 0; Gaps 0; 

Qy          2 LEMPVLPVEPFPF 14
              | :|| |  | ||
Db         62 LALPVCPCRPSPF 74



RESULT 15 
US-07-921-796-6 

; Sequence 6, Application US/07921796 

; Patent No. 5487990 

; GENERAL INFORMATION: 

APPLICANT: Smith, John A. 

APPLICANT: Lee, Fang-Jen S.

APPLICANT: Lin, Lee-Wen 

TITLE OF INVENTION: The Glucose-Regulated Promoter of Yeast 



; TITLE OF INVENTION: Acetyl-CoA Hydrolase 

NUMBER OF SEQUENCES: 11 
; CORRESPONDENCE ADDRESS: 

ADDRESSEE: Sterne, Kessler, Goldstein & Fox 
; STREET: 1100 New York Avenue, Suite 600 

CITY: Washington 
; STATE: D.C.

COUNTRY: USA 
; ZIP: 20005 

COMPUTER READABLE FORM: 
; MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 
; OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/07/921,796
FILING DATE: 30-JUL-1992 
; CLASSIFICATION: 435 

ATTORNEY/AGENT INFORMATION: 
NAME: Sanzo, Michael A. 
REGISTRATION NUMBER: 36,912 
; REFERENCE/DOCKET NUMBER: 0609,1600003 

TELECOMMUNICATION INFORMATION: 
; TELEPHONE: (202) 371-2600 

TELEFAX: (202) 371-2540 
; TELEX: 248636 SSK 

INFORMATION FOR SEQ ID NO: 6: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 526 amino acids 
TYPE: amino acid 
; TOPOLOGY: linear 

MOLECULE TYPE: protein 
US-07-921-796-6 



Query Match 48.1%; Score 39; DB 1; Length 526;
Best Local Similarity 41.2%; Pred. No. 2.6e+02;
Matches 7; Conservative 6; Mismatches 2; Indels 2; Gaps 1;

Qy          1 DLEMPVLPV--EPFPFV 15
              |::||| |   :|:|::
Db        191 DIDMPVNPPFRKPYPYL 207

Search completed: August 24, 2004, 15:55:15 
Job time : 17.4552 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 



Run on: August 24, 2004, 15:26:28 ; Search time 14.5522 Seconds

(without alignments)
99.151 Million cell updates/sec

Title: US-09-641-801-5

Perfect score: 81

Sequence: 1 DLEMPVLPVEPFPFV 15

Scoring table: BLOSUM62

Gapop 10.0 , Gapext 0.5

Searched: 283366 seqs, 96191526 residues

Total number of hits satisfying chosen parameters: 283366

Minimum DB seq length: 0

Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%

Maximum Match 100%
Listing first 45 summaries

Database : PIR_78:*
pir1:*
pir2:*
pir3:*
pir4:*


Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
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ALIGNMENTS 



RESULT 1 
T24429 

hypothetical protein T04A8.11 - Caenorhabditis elegans 
C;Species: Caenorhabditis elegans
C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 15-Oct-1999
C;Accession: T24429
R;Palmer, S.
submitted to the EMBL Data Library, August 1994
A;Reference number: Z19889
A;Accession: T24429
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-219 <WIL>
A;Cross-references: EMBL:Z35663; PIDN:CAA84730.1; GSPDB:GN00021; CESP:T04A8.11
A;Experimental source: clone T04A8
C;Genetics:
A;Gene: CESP:T04A8.11
A;Map position: 3
A;Introns: 12/1; 40/2; 148/3

Query Match 55.6%; Score 45; DB 2; Length 219;
Best Local Similarity 63.6%; Pred. No. 5.9;
Matches 7; Conservative 3; Mismatches 1; Indels 0; Gaps 0;

Qy          2 LEMPVLPVEPF 12
              |::||:| |||
Db         21 LKLPVMPAEPF 31



RESULT 2 
H82264 

probable capK protein VC0924 [imported] - Vibrio cholerae (strain N16961
serogroup O1)
C;Species: Vibrio cholerae
C;Date: 18-Aug-2000 #sequence_revision 20-Aug-2000 #text_change 02-Feb-2001
C;Accession: H82264
R;Heidelberg, J.F.; Eisen, J.A.; Nelson, W.C.; Clayton, R.A.; Gwinn, M.L.;
Dodson, R.J.; Haft, D.H.; Hickey, E.K.; Peterson, J.D.; Umayam, L.A.; Gill,
S.R.; Nelson, K.E.; Read, T.D.; Tettelin, H.; Richardson, D.; Ermolaeva, M.D.;
Vamathevan, J.; Bass, S.; Qin, H.; Dragoi, I.; Sellers, P.; McDonald, L.;
Utterback, T.; Fleishmann, R.D.; Nierman, W.C.; White, O.; Salzberg, S.L.;
Smith, H.O.; Colwell, R.R.; Mekalanos, J.J.; Venter, J.C.; Fraser, C.M.
Nature 406, 477-483, 2000
A;Title: DNA Sequence of both chromosomes of the cholera pathogen Vibrio
cholerae.
A;Reference number: A82035; MUID:20406833; PMID:10952301
A;Accession: H82264
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-446 <HEI>
A;Cross-references: GB:AE004175; GB:AE003852; NID:g9655366; PIDN:AAF94086.1;
GSPDB:GN00126; TIGR:VC0924
A;Experimental source: serogroup O1; strain N16961; biotype El Tor
C;Genetics:
A;Gene: VC0924
A;Map position: 1
A;Map position: 1 



Query Match 55.6%; Score 45; DB 2; Length 446;
Best Local Similarity 53.3%; Pred. No. 13;
Matches 8; Conservative 4; Mismatches 3; Indels 0; Gaps 0;

Qy          1 DLEMPVLPVEPFPFV 15
              |||:| | :| ||::
Db         76 DLEVPNLELEAFPYL 90



RESULT 3 
H69490 

formylmethanofuran dehydrogenase (tungsten) chain B homolog (fwdB-2) -
Archaeoglobus fulgidus
C;Species: Archaeoglobus fulgidus
C;Date: 05-Dec-1997 #sequence_revision 05-Dec-1997 #text_change 24-Sep-1999
C;Accession: H69490

R;Klenk, H.P.; Clayton, R.A.; Tomb, J.F.; White, O.; Nelson, K.E.; Ketchum,
K.A.; Dodson, R.J.; Gwinn, M.; Hickey, E.K.; Peterson, J.D.; Richardson, D.L.;
Kerlavage, A.R.; Graham, D.E.; Kyrpides, N.C.; Fleischmann, R.D.; Quackenbush,
J.; Lee, N.H.; Sutton, G.G.; Gill, S.; Kirkness, E.F.; Dougherty, B.A.; McKenny,
K.; Adams, M.D.; Loftus, B.; Peterson, S.; Reich, C.I.; McNeil, L.K.; Badger,
J.H.; Glodek, A.; Zhou, L.; Overbeek, R.; Gocayne, J.D.; Weidman, J.F.;
McDonald, L.
Nature 390, 364-370, 1997
A;Authors: Utterback, T.; Cotton, M.D.; Spriggs, T.; Artiach, P.; Kaine, B.P.;
Sykes, S.M.; Sadow, P.W.; D'Andrea, K.P.; Bowman, C.; Fujii, C.; Garland, S.A.;
Mason, T.M.; Olsen, G.J.; Fraser, C.M.; Smith, H.O.; Woese, C.R.; Venter, J.C.



A;Title: The complete genome sequence of the hyperthermophilic, sulfate-reducing
archaeon Archaeoglobus fulgidus.
A;Reference number: A69250; MUID:98049343; PMID:9389475
A;Accession: H69490
A;Status: preliminary; nucleic acid sequence not shown; translation not shown
A;Molecule type: DNA
A;Residues: 1-421 <KLE>
A;Cross-references: GB:AE000970; GB:AE000782; NID:g2689293; PIDN:AAB89326.1;
PID:g2648615; TIGR:AF1929
C;Superfamily: formylmethanofuran dehydrogenase (molybdenum) beta chain
C;Keywords: iron-sulfur protein; metalloprotein; molybdenum; molybdopterin



Query Match 53.1%; Score 43; DB 2; Length 421;
Best Local Similarity 54.5%; Pred. No. 26;
Matches 6; Conservative 4; Mismatches 1; Indels 0; Gaps 0;

Qy          3 EMPVLPVEPFP 13
              |:||: ::|||
Db        350 EIPVIQIDPFP 360



RESULT 4 
T24690 

hypothetical protein T08D2.8 - Caenorhabditis elegans 
C;Species: Caenorhabditis elegans
C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 15-Oct-1999
C;Accession: T24690
R;McMurray, A.
submitted to the EMBL Data Library, March 1997
A;Reference number: Z19924
A;Accession: T24690
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-556 <WIL>
A;Cross-references: EMBL:Z92839; PIDN:CAB07415.1; CESP:T08D2.8
A;Experimental source: clone T08D2
C;Genetics:
A;Gene: CESP:T08D2.8
A;Introns: 5/1; 54/3; 80/3; 255/3; 307/3; 352/3; 479/1

Query Match 53.1%; Score 43; DB 2; Length 556;
Best Local Similarity 72.7%; Pred. No. 36;
Matches 8; Conservative 1; Mismatches 2; Indels 0; Gaps 0;

Qy          5 PVLPVEPFPFV 15
              || ||:| |||
Db        443 PVAPVKPKPFV 453
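
Note on Best Local Similarity: the values in this listing agree with the number of identical columns divided by the total number of alignment columns (gap columns included). The Python sketch below assumes that definition; `local_similarity_percent` is an illustrative name, not a function of the search software.

```python
def local_similarity_percent(query: str, subject: str) -> float:
    """Identical columns as a percentage of all alignment columns.

    Both strings are the aligned rows, with '-' marking gap positions.
    """
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same length")
    identical = sum(q == s and q != "-" for q, s in zip(query, subject))
    return round(100.0 * identical / len(query), 1)


# Qy 5-15 against Db 443-453 from the record above: 8 identical of 11 columns.
print(local_similarity_percent("PVLPVEPFPFV", "PVAPVKPKPFV"))
```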



RESULT 5 



S54532 

probable membrane protein YDR236c - yeast (Saccharomyces cerevisiae)
N;Alternate names: hypothetical protein YD8419.03c
C;Species: Saccharomyces cerevisiae
C;Date: 08-Jul-1995 #sequence_revision 01-Sep-1995 #text_change 19-Apr-2002
C;Accession: S54532
R;Oliver, K.; Harris, D.
submitted to the EMBL Data Library, May 1995
A;Reference number: S54530
A;Accession: S54532
A;Molecule type: DNA
A;Residues: 1-218 <OLI>
A;Cross-references: EMBL:Z49701; NID:g817819; PID:g817822; GSPDB:GN00004;
MIPS:YDR236c
A;Experimental source: strain AB972
C;Genetics:
A;Gene: SGD:FMN1; MIPS:YDR236c
A;Cross-references: SGD:S0002644
A;Map position: 4R
C;Keywords: transmembrane protein
F;4-20/Domain: transmembrane #status predicted <TMM>

Query Match 51.9%; Score 42; DB 2; Length 218;
Best Local Similarity 53.3%; Pred. No. 18;
Matches 8; Conservative 1; Mismatches 6; Indels 0; Gaps 0;

Qy          1 DLEMPVLPVEPFPFV 15
              || :|  |  ||| |
Db         37 DLPIPAQPGPPFPLV 51



RESULT 6 
JC6530 

laminin receptor processed pseudogene LAMRL5 - human 
C;Species: Homo sapiens (man)
C;Date: 28-Aug-1998 #sequence_revision 28-Aug-1998 #text_change 28-Aug-1998
C;Accession: JC6530
R;Richardson, M.P.; Braybrook, C.; Tham, M.; Moore, G.E.; Stanier, P.
Gene 206, 145-150, 1998
A;Title: Molecular cloning and characterization of a highly conserved human
kDa laminin receptor pseudogene mapping to Xq21.3.
A;Reference number: JC6530; MUID:98121324; PMID:9461426
A;Accession: JC6530
A;Status: conceptual translation of pseudogene
A;Molecule type: DNA
A;Residues: 1-295 <RIC>
A;Experimental source: brain
C;Comment: No evidence could be found that this intronless gene sequence is
expressed.
C;Genetics:
A;Gene: LAMRL5
A;Map position: Xq21.3
A;Introns: #status absent
C;Keywords: brain; glycoprotein; laminin binding; pseudogene; receptor



Query Match 51.9%; Score 42; DB 4; Length 295;
Best Local Similarity 46.2%; Pred. No. 26;
Matches 6; Conservative 5; Mismatches 2; Indels 0; Gaps 0;

Qy          1 DLEMPVLPVEPFP 13
              ::|:| :|:| ||
Db        251 EVEVPSVPIEEFP 263



RESULT 7 
T24298 

hypothetical protein T01E8.2 - Caenorhabditis elegans
C;Species: Caenorhabditis elegans
C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 15-Oct-1999
C;Accession: T24298
R;McMurray, A.
submitted to the EMBL Data Library, March 1995
A;Reference number: Z19871
A;Accession: T24298
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-367 <WIL>
A;Cross-references: EMBL:Z48809; PIDN:CAA88744.1; GSPDB:GN00020; CESP:T01E8.2
A;Experimental source: clone T01E8
C;Genetics:
A;Gene: CESP:T01E8.2
A;Map position: 2
A;Introns: 48/2; 200/2; 254/3

Query Match 51.9%; Score 42; DB 2; Length 367; 

Best Local Similarity 60.0%; Pred. No. 33; 

Matches 6; Conservative 1; Mismatches 3; Indels 0; Gaps 0; 

Qy          5 PVLPVEPFPF 14
              |: |  ||||
Db        295 PIFPFRPFPF 304



RESULT 8 
AF1942 

hypothetical protein all1089 [imported] - Nostoc sp. (strain PCC 7120)
C;Species: Nostoc sp. PCC 7120
A;Note: Nostoc sp. strain PCC 7120 is a synonym of Anabaena sp. strain PCC 7120
C;Date: 14-Dec-2001 #sequence_revision 14-Dec-2001 #text_change 09-Dec-2002
C;Accession: AF1942
R;Kaneko, T.; Nakamura, Y.; Wolk, C.P.; Kuritz, T.; Sasamoto, S.; Watanabe, A.;
Iriguchi, M.; Ishikawa, A.; Kawashima, K.; Kimura, T.; Kishida, Y.; Kohara, M.;
Matsumoto, M.; Matsuno, A.; Muraki, A.; Nakazaki, N.; Shimpo, S.; Sugimoto, M.;
Takazawa, M.; Yamada, M.; Yasuda, M.; Tabata, S.
DNA Res. 8, 205-213, 2001
A;Title: Complete Genomic Sequence of the Filamentous Nitrogen-fixing
Cyanobacterium Anabaena sp. strain PCC 7120.
A;Reference number: AB1807; MUID:21595285; PMID:11759840
A;Accession: AF1942
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-448 <KUR>
A;Cross-references: GB:BA000019; PIDN:BAB73046.1; PID:g17130435; GSPDB:GN00179
A;Experimental source: strain PCC 7120
C;Genetics:
A;Gene: all1089



Query Match 51.9%; Score 42; DB 2; Length 448; 

Best Local Similarity 60.0%; Pred. No. 41; 

Matches 6; Conservative 4; Mismatches 0; Indels 0; Gaps 0; 

Qy          3 EMPVLPVEPF 12
              |:|:|||:|:
Db         98 ELPLLPVDPY 107



RESULT 9 
AF2433 

aldehyde dehydrogenase [imported] - Nostoc sp. (strain PCC 7120)
C;Species: Nostoc sp. PCC 7120
A;Note: Nostoc sp. strain PCC 7120 is a synonym of Anabaena sp. strain PCC 7120
C;Date: 14-Dec-2001 #sequence_revision 14-Dec-2001 #text_change 09-Dec-2002
C;Accession: AF2433
R;Kaneko, T.; Nakamura, Y.; Wolk, C.P.; Kuritz, T.; Sasamoto, S.; Watanabe, A.;
Iriguchi, M.; Ishikawa, A.; Kawashima, K.; Kimura, T.; Kishida, Y.; Kohara, M.;
Matsumoto, M.; Matsuno, A.; Muraki, A.; Nakazaki, N.; Shimpo, S.; Sugimoto, M.;
Takazawa, M.; Yamada, M.; Yasuda, M.; Tabata, S.
DNA Res. 8, 205-213, 2001
A;Title: Complete Genomic Sequence of the Filamentous Nitrogen-fixing
Cyanobacterium Anabaena sp. strain PCC 7120.
A;Reference number: AB1807; MUID:21595285; PMID:11759840
A;Accession: AF2433
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-473 <KUR>
A;Cross-references: GB:BA000019; PIDN:BAB76721.1; PID:g17134160; GSPDB:GN00179
A;Experimental source: strain PCC 7120
C;Genetics:
A;Gene: all5022
C;Superfamily: aldehyde dehydrogenase (NAD+); aldehyde dehydrogenase homology

Query Match 51.9%; Score 42; DB 2; Length 473; 

Best Local Similarity 63.6%; Pred. No. 44; 

Matches 7; Conservative 2; Mismatches 2; Indels 0; Gaps 0; 

Qy          5 PVLPVEPFPFV 15
              |::|| ||| |
Db        365 PIMPVMPFPDV 375



RESULT 10 
T07447 

DNA-directed RNA polymerase (EC 2.7.7.6) beta'-1 chain - Japanese black pine
chloroplast (fragment)
C;Species: chloroplast Pinus thunbergiana (Japanese black pine)
C;Date: 14-May-1999 #sequence_revision 14-May-1999 #text_change 18-Aug-2000
C;Accession: T07447
R;Wakasugi, T.; Tsudzuki, J.; Ito, S.; Nakashima, K.; Tsudzuki, T.; Sugiura, M.
Proc. Natl. Acad. Sci. U.S.A. 91, 9794-9798, 1994
A;Title: Loss of all ndh genes as determined by sequencing the entire
chloroplast genome of the black pine Pinus thunbergii.
A;Reference number: Z16030; MUID:95024047; PMID:7937893
A;Accession: T07447
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-696 <WAK>
A;Cross-references: EMBL:D17510; NID:g529643; PIDN:BAA23472.1; PID:g2626945
C;Genetics:
A;Gene: rpoC1
A;Genome: chloroplast
A;Note: intron positions not resolved (incomplete sequence)
C;Superfamily: chloroplast DNA-directed RNA polymerase beta'-1 chain
C;Keywords: chloroplast; nucleotidyltransferase; transcription

Query Match 51.9%; Score 42; DB 2; Length 696; 

Best Local Similarity 66.7%; Pred. No. 68; 

Matches 8; Conservative 1; Mismatches 3; Indels 0; Gaps 0; 

Qy          4 MPVLPVEPFPFV 15
              :|||| || | |
Db        285 LPVLPPEPRPIV 296



RESULT 11 
T17426 

FK506 polyketide synthetase fkbP [imported] - Streptomyces sp. (strain MA6548)
C;Species: Streptomyces sp.
A;Variety: strain MA6548
C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 03-Nov-2000
C;Accession: T17426
R;Motamedi, H.; Shafiee, A.
Eur. J. Biochem. 256, 528-534, 1998
A;Title: The biosynthetic gene cluster for the macrolactone ring of the
immunosuppressant FK506.
A;Reference number: Z18779; MUID:98451508; PMID:9780228
A;Accession: T17426
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-1504 <MOT>
A;Cross-references: EMBL:AF082100; NID:g3798623; PID:g3798625; PIDN:AAC68816.1
A;Experimental source: strain MA6548
C;Genetics:
A;Gene: fkbP
C;Function:
A;Description: required during the biosynthesis of the immunosuppressant FK506
for the activation and incorporation of the pipecolate moiety into the completed
acyl chain
C;Superfamily: Mycobacterium tuberculosis mbtE protein; acetate-CoA ligase
homology; acyl carrier protein homology
C;Keywords: carrier protein; phosphopantetheine; phosphoprotein
F;533-982/Domain: acetate-CoA ligase homology <ACL>
F;999-1067/Domain: acyl carrier protein homology <ACP>
F;1031/Binding site: phosphopantetheine (Ser) (covalent) #status predicted



Query Match 51.9%; Score 42; DB 2; Length 1504; 

Best Local Similarity 53.3%; Pred. No. 1.6e+02; 

Matches 8; Conservative 2; Mismatches 5; Indels 0; Gaps 0; 



Qy          1 DLEMPVLPVEPFPFV 15
              :| :| |  |||| |
Db       1423 ELRLPGLRTEPFPVV 1437



RESULT 12 
B97432 

glucose-6-phosphate 1-dehydrogenase (g6pd) [imported] - Agrobacterium
tumefaciens (strain C58, Cereon)
C;Species: Agrobacterium tumefaciens
C;Date: 30-Sep-2001 #sequence_revision 30-Sep-2001 #text_change 18-Nov-2002
C;Accession: B97432
R;Goodner, B.; Hinkle, G.; Gattung, S.; Miller, N.; Blanchard, M.; Qurollo, B.;
Goldman, B.S.; Cao, Y.; Askenazi, M.; Halling, C.; Mullin, L.; Houmiel, K.;
Gordon, J.; Vaudin, M.; Iartchouk, O.; Epp, A.; Liu, F.; Wollam, C.; Allinger,
M.; Doughty, D.; Scott, C.; Lappas, C.; Markelz, B.; Flanagan, C.; Crowell, C.;
Gurson, J.; Lomo, C.; Sear, C.; Strub, G.; Cielo, C.; Slater, S.
Science 294, 2323-2328, 2001
A;Title: Genome Sequence of the Plant Pathogen and Biotechnology Agent
Agrobacterium tumefaciens C58.
A;Reference number: A97359; MUID:21608551; PMID:11743194
A;Accession: B97432
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-503 <KUR>
A;Cross-references: GB:AE007869; PIDN:AAK86411.1; PID:g15155545; GSPDB:GN00169
C;Genetics:
A;Gene: AGR_C_1065
A;Map position: circular chromosome
C;Superfamily: glucose-6-phosphate dehydrogenase

Query Match 50.6%; Score 41; DB 2; Length 503; 

Best Local Similarity 46.7%; Pred. No. 68; 

Matches 7; Conservative 3; Mismatches 5; Indels 0; Gaps 0; 

Qy          1 DLEMPVLPVEPFPFV 15
              |:   ::|||||  |
Db         12 DMSSQIIPVEPFDCV 26



RESULT 13 
T41201 

isoleucyl-trna synthetase - fission yeast (Schizosaccharomyces pombe)
C;Species: Schizosaccharomyces pombe
C;Date: 03-Dec-1999 #sequence_revision 03-Dec-1999 #text_change 21-Jan-2000
C;Accession: T41201
R;Wood, V.; Rajandream, M.A.; Barrell, B.G.; Jimenez Martinez, J.
submitted to the EMBL Data Library, July 1999
A;Reference number: Z21978
A;Accession: T41201
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-973 <WOO>
A;Cross-references: EMBL:AL109736; PIDN:CAB52155.1; GSPDB:GN00068
A;Experimental source: strain 972h-; cosmid c18B5
C;Genetics:
A;Gene: SPBC18B5.08c
A;Map position: 3
C;Superfamily: isoleucine-tRNA ligase



Query Match 50.6%; Score 41; DB 2; Length 973; 

Best Local Similarity 53.8%; Pred. No. 1.4e+02; 

Matches 7; Conservative 3; Mismatches 3; Indels 0; Gaps 0; 

Qy          3 EMPVLPVEPFPFV 15
              | |:|| : |||:
Db        325 ENPLLPKQSFPFL 337



RESULT 14 
T00333 

hypothetical protein KIAA0560 - human 
C;Species: Homo sapiens (man)
C;Date: 01-Feb-1999 #sequence_revision 01-Feb-1999 #text_change 11-Jan-2002
C;Accession: T00333
R;Nagase, T.; Ishikawa, K.; Miyajima, N.; Tanaka, A.; Kotani, H.; Nomura, N.;
Ohara, O.
DNA Res. 5, 31-39, 1998
A;Title: Prediction of the coding sequences of unidentified human genes. IX. The
complete sequences of 100 new cDNA clones from brain which can code for large
proteins in vitro.
A;Reference number: Z14086; MUID:98290545; PMID:9628581
A;Accession: T00333
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1421 <NAG>
A;Cross-references: EMBL:AB011132; NID:d1185402; PIDN:BAA25486.1
A;Experimental source: brain; clone HH1648
C;Genetics:
A;Note: KIAA0560

Query Match 50.6%; Score 41; DB 2; Length 1421; 

Best Local Similarity 50.0%; Pred. No. 2.2e+02; 

Matches 6; Conservative 3; Mismatches 3; Indels 0; Gaps 0; 

Qy          2 LEMPVLPVEPFP 13
              | : ::| ||||
Db       1264 LHLHIIPTEPFP 1275



RESULT 15 
AG3000 

1-acyl-sn-glycerol-3-phosphate acyltransferase plsC [imported] - Agrobacterium
tumefaciens (strain C58, Dupont)
C;Species: Agrobacterium tumefaciens
C;Date: 11-Jan-2002 #sequence_revision 11-Jan-2002 #text_change 18-Nov-2002
C;Accession: AG3000
R;Wood, D.W.; Setubal, J.C.; Kaul, R.; Monks, D.; Chen, L.; Wood, G.E.; Chen,
Y.; Woo, L.; Kitajima, J.P.; Okura, V.K.; Almeida Jr., N.F.; Zhou, Y.; Bovee
Sr., D.; Chapman, P.; Clendenning, J.; Deatherage, G.; Gillet, W.; Grant, C.;
Guenthner, D.; Kutyavin, T.; Levy, R.; Li, M.; McClelland, E.; Palmieri, A.;
Raymond, C.; Rouse, G.; Saenphimmachak, C.; Wu, Z.; Gordon, D.; Eisen, J.A.;
Paulsen, I.; Karp, P.; Romero, P.; Zhang, S.
Science 294, 2317-2323, 2001
A;Authors: Yoo, H.; Tao, Y.; Biddle, P.; Jung, M.; Krespan, W.; Perry, M.;
Gordon-Kamm, B.; Liao, L.; Kim, S.; Hendrick, C.; Zhao, Z.; Dolan, M.; Tingey,
S.V.; Tomb, J.; Gordon, M.P.; Olson, M.V.; Nester, E.W.
A;Title: The Genome of the Natural Genetic Engineer Agrobacterium tumefaciens
C58.
A;Reference number: AB2577; MUID:21608550; PMID:11743193
A;Accession: AG3000
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-266 <KUR>
A;Cross-references: GB:AE008689; PIDN:AAL44421.1; PID:g17742021; GSPDB:GN00187
A;Experimental source: strain C58 (Dupont)
C;Genetics:
A;Gene: plsC
A;Map position: linear chromosome

Query Match 49.4%; Score 40; DB 2; Length 266; 

Best Local Similarity 53.8%; Pred. No. 48; 

Matches 7; Conservative 3; Mismatches 3; Indels 0; Gaps 0; 

Qy          1 DLEMPVLPVEPFP 13
              ||::||:||   |
Db        174 DLQVPVIPVAMHP 186



Search completed: August 24, 2004, 15:52:51 
Job time : 18.5522 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model

Run on:         August 24, 2004, 15:51:19 ; Search time 54.291 Seconds
                (without alignments)
                86.825 Million cell updates/sec

Title:          US-09-641-801-5
Perfect score:  81
Sequence:       1 DLEMPVLPVEPFPFV 15

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       1295152 seqs, 314255058 residues
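
Note on the perfect score: the value 81 is what one obtains by summing the BLOSUM62 self-substitution (diagonal) scores over the 15 residues of the query as it appears in the alignments. The Python sketch below assumes that this is how the figure is derived; the table lists only the residues present in the query, and the names are illustrative.

```python
# BLOSUM62 diagonal (self-substitution) scores for the residues in the query.
BLOSUM62_SELF = {"D": 6, "E": 5, "F": 6, "L": 4, "M": 5, "P": 7, "V": 4}


def perfect_score(sequence: str) -> int:
    """Score of a sequence aligned against itself, with no gaps."""
    return sum(BLOSUM62_SELF[residue] for residue in sequence)


QUERY = "DLEMPVLPVEPFPFV"  # query sequence as shown in the alignments
print(perfect_score(QUERY))
```

The same sum divided into each hit score reproduces the Query Match percentages reported for the hits.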



Total number of hits satisfying chosen parameters: 1295152

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries



Database :      Published_Applications_AA:*
                1:  /cgn2_6/ptodata/1/pubpaa/US07_PUBCOMB.pep:*
                2:  /cgn2_6/ptodata/1/pubpaa/PCT_NEW_PUB.pep:*
                3:  /cgn2_6/ptodata/1/pubpaa/US06_NEW_PUB.pep:*
                4:  /cgn2_6/ptodata/1/pubpaa/US06_PUBCOMB.pep:*
                5:  /cgn2_6/ptodata/1/pubpaa/US07_NEW_PUB.pep:*
                6:  /cgn2_6/ptodata/1/pubpaa/PCTUS_PUBCOMB.pep:*
                7:  /cgn2_6/ptodata/1/pubpaa/US08_NEW_PUB.pep:*
                8:  /cgn2_6/ptodata/1/pubpaa/US08_PUBCOMB.pep:*
                9:  /cgn2_6/ptodata/1/pubpaa/US09A_PUBCOMB.pep:*
                10: /cgn2_6/ptodata/1/pubpaa/US09B_PUBCOMB.pep:*
                11: /cgn2_6/ptodata/1/pubpaa/US09C_PUBCOMB.pep:*
                12: /cgn2_6/ptodata/1/pubpaa/US09_NEW_PUB.pep:*
                13: /cgn2_6/ptodata/1/pubpaa/US10A_PUBCOMB.pep:*
                14: /cgn2_6/ptodata/1/pubpaa/US10B_PUBCOMB.pep:*
                15: /cgn2_6/ptodata/1/pubpaa/US10C_PUBCOMB.pep:*
                16: /cgn2_6/ptodata/1/pubpaa/US10_NEW_PUB.pep:*
                17: /cgn2_6/ptodata/1/pubpaa/US60_NEW_PUB.pep:*
                18: /cgn2_6/ptodata/1/pubpaa/US60_PUBCOMB.pep:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
ALIGNMENTS



RESULT 1 
US-10-281-652-5 

; Sequence 5, Application US/10281652
; Publication No. US20030091606A1
; GENERAL INFORMATION:
; APPLICANT: STANTON, G. John
; APPLICANT: HUGHES, Thomas K.
; APPLICANT: BOLDOGH, Istvan
; TITLE OF INVENTION: USE OF COLOSTRININ, CONSTITUENT PEPTIDES THEREOF, AND
; TITLE OF INVENTION: ANALOGS THEREOF AS OXIDATIVE STRESS REGULATORS
; FILE REFERENCE: 265.00220101
; CURRENT APPLICATION NUMBER: US/10/281,652
; CURRENT FILING DATE: 2002-10-28
; PRIOR APPLICATION NUMBER: US/09/641,803
; PRIOR FILING DATE: 2000-08-17
; PRIOR APPLICATION NUMBER: 60/149,310
; PRIOR FILING DATE: 1999-08-17
; NUMBER OF SEQ ID NOS: 34
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 5
; LENGTH: 15
; TYPE: PRT
; ORGANISM: Artificial Sequence
; FEATURE:
; OTHER INFORMATION: Description of Artificial Sequence: synthetic
; OTHER INFORMATION: peptide
US-10-281-652-5 

Query Match 100.0%; Score 81; DB 14; Length 15; 

Best Local Similarity 100.0%; Pred. No. 2.3e-05; 

Matches 15; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy          1 DLEMPVLPVEPFPFV 15
              |||||||||||||||
Db          1 DLEMPVLPVEPFPFV 15



RESULT 2 

US-10-425-114-58864

Sequence 58864, Application US/10425114
Publication No. US20040034888A1
GENERAL INFORMATION: 
APPLICANT: Liu, Jingdong 
APPLICANT: Zhou, Yihua 
APPLICANT: Kovalic, David K. 
APPLICANT: Screen, Steven E 
APPLICANT: Tabaska, Jack E 
APPLICANT: Cao, Yongwei 

TITLE OF INVENTION: Nucleic Acid Molecules and Other Molecules Associated With
TITLE OF INVENTION: Plants and Uses Thereof for Plant Improvement
FILE REFERENCE: 38-21(53313)B
CURRENT APPLICATION NUMBER: US/10/425,114
CURRENT FILING DATE: 2003-04-28 
NUMBER OF SEQ ID NOS: 73128 
SEQ ID NO 58864 
LENGTH: 104 
TYPE: PRT 

ORGANISM: Zea mays 
FEATURE : 

OTHER INFORMATION: Clone ID: 700431848_FLI.pep
US-10-425-114-58864 



Query Match 60.5%; Score 49; DB 12; Length 104; 

Best Local Similarity 60.0%; Pred. No. 8.5; 

Matches 9; Conservative 3; Mismatches 3; Indels 0; Gaps 0; 

Qy          1 DLEMPVLPVEPFPFV 15
              || :|||| :|| |:
Db         61 DLWLPVLPFQPFLFL 75



RESULT 3 

US-10-104-047-3111 

; Sequence 3111, Application US/10104047
; Publication No. US20030236392A1
; GENERAL INFORMATION:
; APPLICANT: HELIX RESEARCH INSTITUTE
; TITLE OF INVENTION: Novel full length cDNA
; FILE REFERENCE: H1-A0105
; CURRENT APPLICATION NUMBER: US/10/104,047
; CURRENT FILING DATE: 2002-03-25
; PRIOR APPLICATION NUMBER:
; PRIOR FILING DATE:
; NUMBER OF SEQ ID NOS: 4096
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 3111
; LENGTH: 232
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-104-047-3111 

Query Match 59.3%; Score 48; DB 15; Length 232;
Best Local Similarity 66.7%; Pred. No. 27;
Matches 8; Conservative 2; Mismatches 2; Indels 0; Gaps 0;

Qy          3 EMPVLPVEPFPF 14
              : ||||| |:||
Db        209 KFPVLPVHPWPF 220



RESULT 4 

US-10-282-122A-77182

Sequence 77182, Application US/10282122A
Publication No. US20040029129A1
GENERAL INFORMATION:
APPLICANT: Wang, Liangsu
APPLICANT: Zamudio, Carlos 
APPLICANT: Malone, Cheryl 
APPLICANT: Haselbeck, Robert 
APPLICANT: Ohlsen, Kari 
APPLICANT: Zyskind, Judith 
APPLICANT: Wall, Daniel 
APPLICANT: Trawick, John 
APPLICANT: Carr, Grant 
APPLICANT: Yamamoto, Robert
APPLICANT: Forsyth, R.
APPLICANT: Xu, H. 

TITLE OF INVENTION: Identification of Essential Genes in Microorganisms 
FILE REFERENCE: ELITRA.034A
; CURRENT APPLICATION NUMBER: US/10/282,122A

; CURRENT FILING DATE: 2003-02-20 

; PRIOR APPLICATION NUMBER: 60/191,078 

; PRIOR FILING DATE: 2000-03-21 

; PRIOR APPLICATION NUMBER: 60/206,848 

; PRIOR FILING DATE: 2000-05-23 

; PRIOR APPLICATION NUMBER: 60/207,727 

; PRIOR FILING DATE: 2000-05-26 

; PRIOR APPLICATION NUMBER: 60/230,335 

; PRIOR FILING DATE: 2000-09-06 

; PRIOR APPLICATION NUMBER: 60/230,347 

; PRIOR FILING DATE: 2000-09-09 

; PRIOR APPLICATION NUMBER: 60/242,578 

; PRIOR FILING DATE: 2000-10-23 

PRIOR APPLICATION NUMBER: 60/253,625 
; PRIOR FILING DATE: 2000-11-27 
; PRIOR APPLICATION NUMBER: 60/257,931 
; PRIOR FILING DATE: 2000-12-22 
; PRIOR APPLICATION NUMBER: 60/267,636 
; PRIOR FILING DATE: 2001-02-09 
; PRIOR APPLICATION NUMBER: 60/269,308 
; PRIOR FILING DATE: 2001-02-16 

Remaining Prior Application data removed - See File Wrapper or PALM.
; NUMBER OF SEQ ID NOS: 78614 
; SOFTWARE: PatentIn version 3.1 
; SEQ ID NO 77182 
LENGTH: 446 
TYPE: PRT 
; ORGANISM: Vibrio cholerae 
US-10-282-122A-77182 

Query Match 55.6%; Score 45; DB 12; Length 446; 

Best Local Similarity 53.3%; Pred. No. 1.4e+02; 

Matches 8; Conservative 4; Mismatches 3; Indels 0; Gaps 0;

Qy          1 DLEMPVLPVEPFPFV 15
              |||:| | :| ||::
Db         76 DLEVPNLELEAFPYL 90



RESULT 5 

US-10-437-963-108040

Sequence 108040, Application US/10437963
Publication No. US20040123343A1
GENERAL INFORMATION: 
APPLICANT: La Rosa, Thomas J. 
APPLICANT: Kovalic, David K. 
APPLICANT: Zhou, Yihua 
APPLICANT: Cao, Yongwei 
APPLICANT: Wu, Wei 
APPLICANT: Boukharov, Andrey A. 
APPLICANT: Barbazuk, Brad 
APPLICANT: Li, Ping 

TITLE OF INVENTION: Rice Nucleic Acid Molecules and Other Molecules Associated With
; TITLE OF INVENTION: Plants and Uses Thereof for Plant Improvement
; FILE REFERENCE: 38-21(53221)B
CURRENT APPLICATION NUMBER: US/10/437,963
CURRENT FILING DATE: 2003-05-14
NUMBER OF SEQ ID NOS: 204966
SEQ ID NO 108040
LENGTH: 135
TYPE: PRT
ORGANISM: Oryza sativa
FEATURE:
OTHER INFORMATION: Clone ID: PAT_MRT4530_12332C.1.pep
US-10-437-963-108040



Query Match 54.3%; Score 44; DB 16; Length 135;
Best Local Similarity 47.4%; Pred. No. 60;
Matches 9; Conservative 2; Mismatches 4; Indels 4; Gaps 1;

Qy          1 DLEMPVLPVEP----FPFV 15
              |  || :|: |    ||||
Db         75 DAPMPEIPIHPPPPVFPFV 93



RESULT 6 

US-10-424-599-254766 

Sequence 254766, Application US/10424599 
Publication No. US20040031072A1 
GENERAL INFORMATION: 
APPLICANT: La Rosa Thomas J 
APPLICANT: Kovalic David K 
APPLICANT: Zhou Yihua 
APPLICANT: Cao Yongwei 

TITLE OF INVENTION: Soy Nucleic Acid Molecules and Other Molecules Associated 
With 

TITLE OF INVENTION: Plants and Uses Thereof for Plant Improvement 
FILE REFERENCE: 38-21(53223)B
CURRENT APPLICATION NUMBER: US/10/424,599
CURRENT FILING DATE: 2003-04-28
NUMBER OF SEQ ID NOS: 285684
SEQ ID NO 254766
LENGTH: 325
TYPE: PRT
ORGANISM: Glycine max
FEATURE:
OTHER INFORMATION: Clone ID: PAT_MRT3847_72076C.1.pep
US-10-424-599-254766



Query Match 53.1%; Score 43; DB 12; Length 325;
Best Local Similarity 77.8%; Pred. No. 2e+02;
Matches 7; Conservative 1; Mismatches 1; Indels 0; Gaps 0;

Qy          5 PVLPVEPFP 13
              ||:|| |||
Db        191 PVVPVHPFP 199



RESULT 7 

US-10-424-599-284024

; Sequence 284024, Application US/10424599
; Publication No. US20040031072A1



GENERAL INFORMATION: 
APPLICANT: La Rosa Thomas J 
APPLICANT: Kovalic David K 
APPLICANT: Zhou Yihua 
APPLICANT: Cao Yongwei 

TITLE OF INVENTION: Soy Nucleic Acid Molecules and Other Molecules Associated 
With 

TITLE OF INVENTION: Plants and Uses Thereof for Plant Improvement 
FILE REFERENCE: 38-21(53223)B
CURRENT APPLICATION NUMBER: US/10/424,599
CURRENT FILING DATE: 2003-04-28
NUMBER OF SEQ ID NOS: 285684
SEQ ID NO 284024
LENGTH: 68
TYPE: PRT
ORGANISM: Glycine max
FEATURE:
OTHER INFORMATION: Clone ID: PAT_MRT3847_9849C.1.pep
US-10-424-599-284024

Query Match 51.9%; Score 42; DB 12; Length 68;
Best Local Similarity 53.3%; Pred. No. 59;
Matches 8; Conservative 1; Mismatches 6; Indels 0; Gaps 0;

Qy          1 DLEMPVLPVEPFPFV 15
              | | | |  |||| :
Db         44 DAEWPCLYKEPFPLI 58



RESULT 8 

US-10-424-599-204286

Sequence 204286, Application US/10424599 
Publication No. US20040031072A1 
GENERAL INFORMATION: 
APPLICANT: La Rosa Thomas J 
APPLICANT: Kovalic David K 
APPLICANT: Zhou Yihua 
APPLICANT: Cao Yongwei 

TITLE OF INVENTION: Soy Nucleic Acid Molecules and Other Molecules Associated 
With 

TITLE OF INVENTION: Plants and Uses Thereof for Plant Improvement 
FILE REFERENCE: 38-21(53223)B
CURRENT APPLICATION NUMBER: US/10/424,599
CURRENT FILING DATE: 2003-04-28
NUMBER OF SEQ ID NOS: 285684
SEQ ID NO 204286
LENGTH: 174
TYPE: PRT
ORGANISM: Glycine max
FEATURE:
OTHER INFORMATION: Clone ID: PAT_MRT3847_26498C.1.pep
US-10-424-599-204286



Query Match 51.9%; Score 42; DB 12; Length 174;
Best Local Similarity 54.5%; Pred. No. 1.5e+02;
Matches 6; Conservative 5; Mismatches 0; Indels 0; Gaps 0;

Qy          1 DLEMPVLPVEP 11
              ::::||:||||
Db         77 NIDIPVIPVEP 87



RESULT 9 

US-10-424-599-144697 

Sequence 144697, Application US/10424599 
Publication No. US20040031072A1
GENERAL INFORMATION: 
APPLICANT: La Rosa Thomas J 
APPLICANT: Kovalic David K 
APPLICANT: Zhou Yihua 
APPLICANT: Cao Yongwei 

TITLE OF INVENTION: Soy Nucleic Acid Molecules and Other Molecules Associated 
With 

TITLE OF INVENTION: Plants and Uses Thereof for Plant Improvement 
FILE REFERENCE: 38-21(53223)B
CURRENT APPLICATION NUMBER: US/10/424,599
CURRENT FILING DATE: 2003-04-28
NUMBER OF SEQ ID NOS: 285684
SEQ ID NO 144697
LENGTH: 249
TYPE: PRT
ORGANISM: Glycine max
FEATURE:
OTHER INFORMATION: Clone ID: PAT_MRT3847_101680C.1.pep
US-10-424-599-144697 

Query Match 51.9%; Score 42; DB 12; Length 249; 

Best Local Similarity 80.0%; Pred. No. 2.2e+02; 

Matches 8; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 



Qy          5 PVLPVEPFPF 14
              ||||||  ||
Db         48 PVLPVEGLPF 57



RESULT 10 

US-10-369-493-18781 

Sequence 18781, Application US/10369493 
Publication No. US20030233675A1
GENERAL INFORMATION: 
APPLICANT: Cao, Yongwei 
APPLICANT: Hinkle, Gregory J. 
APPLICANT: Slater, Steven C. 
APPLICANT: Goldman, Barry S. 
APPLICANT: Chen, Xianfeng 

TITLE OF INVENTION: EXPRESSION OF MICROBIAL PROTEINS IN PLANTS FOR PRODUCTION OF
TITLE OF INVENTION: PLANTS WITH IMPROVED PROPERTIES
FILE REFERENCE: 38-10(52052)B
CURRENT APPLICATION NUMBER: US/10/369,493

CURRENT FILING DATE: 2003-02-28 

PRIOR APPLICATION NUMBER: US 60/360,039 

PRIOR FILING DATE: 2002-02-21 

NUMBER OF SEQ ID NOS: 47374 



; SEQ ID NO 18781 
LENGTH: 457 
TYPE: PRT 

ORGANISM: Anabaena PCC7120
US-10-369-493-18781 

Query Match 51.9%; Score 42; DB 15; Length 457; 

Best Local Similarity 63.6%; Pred. No. 4e+02;
Matches 7; Conservative 2; Mismatches 2; Indels 0; Gaps 0;

Qy          5 PVLPVEPFPFV 15
              |::|| ||| |
Db        364 PIMPVMPFPDV 374



RESULT 11 
US-10-115-123-456 

; Sequence 456, Application US/10115123
; Publication No. US20030065151A1
; GENERAL INFORMATION:
; APPLICANT: Ruben et al.
; TITLE OF INVENTION: 94 Human Secreted Proteins
; FILE REFERENCE: PZ029G30AP1D2
; CURRENT APPLICATION NUMBER: US/10/115,123
; CURRENT FILING DATE: 2002-04-04
; PRIOR APPLICATION NUMBER: PCT/US99/13418
; PRIOR FILING DATE: 1999-06-15
; PRIOR APPLICATION NUMBER: 60/089,507
; PRIOR FILING DATE: 1998-06-16
; PRIOR APPLICATION NUMBER: 60/089,508
; PRIOR FILING DATE: 1998-06-16
; PRIOR APPLICATION NUMBER: 60/089,509
; PRIOR FILING DATE: 1998-06-16
; PRIOR APPLICATION NUMBER: 60/089,510
; PRIOR FILING DATE: 1998-06-16
; PRIOR APPLICATION NUMBER: 60/090,112
; PRIOR FILING DATE: 1998-06-22
; PRIOR APPLICATION NUMBER: 60/090,113
; PRIOR FILING DATE: 1998-06-22
; NUMBER OF SEQ ID NOS: 532
; SOFTWARE: PatentIn Ver. 2.0
; SEQ ID NO 456
; LENGTH: 86
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-115-123-456 

Query Match 51.2%; Score 41.5; DB 12; Length 86; 

Best Local Similarity 47.1%; Pred. No. 89; 

Matches 8; Conservative 3; Mismatches 1; Indels 5; Gaps 1; 

Qy          2 LEMPVLP-----VEPFP 13
              ||:|:||     : |||
Db         17 LEVPILPTHHLLIHPFP 33
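
Note on the gapped score: the reported score of 41.5 for this alignment is reproduced if a gap of length n costs Gapop + n x Gapext (10.0 + 5 x 0.5 = 12.5 here). The Python sketch below checks that arithmetic under this assumed gap convention; the pair table holds only the BLOSUM62 values needed for this one alignment, and all names are illustrative.

```python
# BLOSUM62 scores for the residue pairs occurring in this alignment only.
PAIR_SCORES = {
    ("L", "L"): 4, ("E", "E"): 5, ("M", "V"): 1, ("P", "P"): 7,
    ("V", "I"): 3, ("E", "H"): 0, ("F", "F"): 6,
}
GAP_OPEN, GAP_EXTEND = 10.0, 0.5  # Gapop and Gapext from the run parameters


def alignment_score(query: str, subject: str) -> float:
    """Sum substitution scores; charge open once per gap plus extend per column."""
    score, in_gap = 0.0, False
    for q, s in zip(query, subject):
        if q == "-" or s == "-":
            if not in_gap:
                score -= GAP_OPEN  # assumed: charged once when a gap starts
            score -= GAP_EXTEND    # assumed: charged for every gap column
            in_gap = True
        else:
            score += PAIR_SCORES[(q, s)]
            in_gap = False
    return score


print(alignment_score("LEMPVLP-----VEPFP", "LEVPILPTHHLLIHPFP"))
```

The ungapped columns sum to 54, and subtracting the 12.5 gap cost gives the reported 41.5.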



RESULT 12 



US-10-012-542-456 

Sequence 456, Application US/10012542 
Publication No. US20030044851A1 
GENERAL INFORMATION: 
APPLICANT: Ruben et al. 

TITLE OF INVENTION: 94 Human Secreted Proteins 
FILE REFERENCE: PZ029P1 

CURRENT APPLICATION NUMBER: US/10/012,542
CURRENT FILING DATE: 2001-12-12 

PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 09/461,325 
PRIOR FILING DATE: EARLIER FILING DATE: 1999-12-14 

PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/089,507 
PRIOR FILING DATE: EARLIER FILING DATE: 1998-06-16 

PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/089,508 
PRIOR FILING DATE: EARLIER FILING DATE: 1998-06-16 

PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/089,509 
PRIOR FILING DATE: EARLIER FILING DATE: 1998-06-16 

PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/089,510 
PRIOR FILING DATE: EARLIER FILING DATE: 1998-06-16 

PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/090,112 
PRIOR FILING DATE: EARLIER FILING DATE: 1998-06-22 

PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 60/090,113 
PRIOR FILING DATE: EARLIER FILING DATE: 1998-06-22 
NUMBER OF SEQ ID NOS: 532
SOFTWARE: PatentIn Ver. 2.0
SEQ ID NO 456 
LENGTH: 86 
TYPE: PRT 

ORGANISM: Homo sapiens 
US-10-012-542-456 

Query Match 51.2%; Score 41.5; DB 14; Length 86; 

Best Local Similarity 47.1%; Pred. No. 89; 

Matches 8; Conservative 3; Mismatches 1; Indels 5; Gaps 1; 

Qy          2 LEMPVLP     VEPFP 13
              ||:|:||     : |||
Db         17 LEVPILPTHHLLIHPFP 33



RESULT 13 

US-10-424-599-204279 

; Sequence 204279, Application US/10424599 

; Publication No. US20040031072A1 

; GENERAL INFORMATION: 

; APPLICANT: La Rosa Thomas J 

; APPLICANT: Kovalic David K 

; APPLICANT: Zhou Yihua 

; APPLICANT: Cao Yongwei 

; TITLE OF INVENTION: Soy Nucleic Acid Molecules and Other Molecules Associated With
; TITLE OF INVENTION: Plants and Uses Thereof for Plant Improvement
; FILE REFERENCE: 38-21(53223)B
; CURRENT APPLICATION NUMBER: US/10/424,599
; CURRENT FILING DATE: 2003-04-28
; NUMBER OF SEQ ID NOS: 285684
; SEQ ID NO 204279
; LENGTH: 88
; TYPE: PRT
; ORGANISM: Glycine max
; FEATURE:
; NAME/KEY: unsure
; LOCATION: (1)..(88)
; OTHER INFORMATION: unsure at all Xaa locations
; FEATURE:
; OTHER INFORMATION: Clone ID: PAT_MRT3847_26491C.1.pep
US-10-424-599-204279



Query Match 50.6%; Score 41; DB 12; Length 88;
Best Local Similarity 45.5%; Pred. No. 1.1e+02;
Matches 5; Conservative 6; Mismatches 0; Indels 0; Gaps 0;

Qy          1 DLEMPVLPVEP 11
              ::::||:|:||
Db         77 NIDIPVIPIEP 87



RESULT 14 

US-10-276-774-1842 

; Sequence 1842, Application US/10276774 

; Publication No. US20040053245A1
; GENERAL INFORMATION:
; APPLICANT: Hyseq, Inc.
; APPLICANT: Tang, Y. Tom et al
; TITLE OF INVENTION: Novel Nucleic Acids and Polypeptides
; FILE REFERENCE: 21272-030
; CURRENT APPLICATION NUMBER: US/10/276,774
; CURRENT FILING DATE: 2002-11-18
; PRIOR APPLICATION NUMBER: 09/560,875
; PRIOR FILING DATE: 2000-04-27
; PRIOR APPLICATION NUMBER: 09/496,914
; PRIOR FILING DATE: 2000-02-03
; NUMBER OF SEQ ID NOS: 2700
; SOFTWARE: Custom
; SEQ ID NO 1842
; LENGTH: 101
; TYPE: PRT
; ORGANISM: Homo sapiens
; FEATURE:
; NAME/KEY: misc_feature
; LOCATION: (1)...(101)
; OTHER INFORMATION: Xaa = any amino acid or nothing
US-10-276-774-1842

Query Match 50.6%; Score 41; DB 12; Length 101;
Best Local Similarity 38.5%; Pred. No. 1.2e+02;
Matches 5; Conservative 6; Mismatches 2; Indels 0; Gaps 0;

Qy          1 DLEMPVLPVEPFP 13
              |:::| :|:: ||
Db         57 DMQVPSVPIQQFP 69



RESULT 15 



US-10-424-599-223803

Sequence 223803, Application US/10424599 
Publication No. US20040031072A1 
GENERAL INFORMATION: 
APPLICANT: La Rosa Thomas J 
APPLICANT: Kovalic David K 
APPLICANT: Zhou Yihua 
APPLICANT: Cao Yongwei 

TITLE OF INVENTION: Soy Nucleic Acid Molecules and Other Molecules Associated With
TITLE OF INVENTION: Plants and Uses Thereof for Plant Improvement
FILE REFERENCE: 38-21(53223)B
CURRENT APPLICATION NUMBER: US/10/424,599
CURRENT FILING DATE: 2003-04-28
NUMBER OF SEQ ID NOS: 285684
SEQ ID NO 223803
LENGTH: 113
TYPE: PRT
ORGANISM: Glycine max
FEATURE:
OTHER INFORMATION: Clone ID: PAT_MRT3847_44123C.1.pep
US-10-424-599-223803

Query Match 50.6%; Score 41; DB 12; Length 113; 

Best Local Similarity 70.0%; Pred. No. 1.4e+02; 

Matches 7; Conservative 1; Mismatches 2; Indels 0; Gaps 0; 

Qy          5 PVLPVEPFPF 14
              |: || ||||
Db         94 PLGPVSPFPF 103



Search completed: August 24, 2004, 16:41:20 
Job time : 56.291 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 

Run on: August 24, 2004, 15:23:00 ; Search time 46.3433 Seconds
(without alignments)
102.124 Million cell updates/sec

Title: US-09-641-801-5

Perfect score: 81 

Sequence: 1 DLEMPVLPVEPFPFV 15 

Scoring table: BLOSUM62 

Gapop 10.0 , Gapext 0.5 

Searched: 1017041 seqs, 315518202 residues 

Total number of hits satisfying chosen parameters: 1017041 



Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 
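
The perfect score of 81 reported above is consistent with scoring the 15-residue query against itself under BLOSUM62. The following is a minimal Python sketch of that check, assuming the standard BLOSUM62 diagonal (self-substitution) values for the seven residue types that occur in the query; the dictionary and function names are illustrative and are not part of the GenCore output.

```python
# BLOSUM62 self-substitution (diagonal) scores, restricted to the residue
# types present in the query. Names here are illustrative assumptions.
BLOSUM62_SELF = {"D": 6, "L": 4, "E": 5, "M": 5, "P": 7, "V": 4, "F": 6}

def perfect_score(seq: str) -> int:
    """Score of a sequence aligned against itself with no gaps."""
    return sum(BLOSUM62_SELF[residue] for residue in seq)

QUERY = "DLEMPVLPVEPFPFV"
print(perfect_score(QUERY))  # 81
```

A database hit can reach this value only by matching all 15 residues without gaps.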



Database : SPTREMBL_25:*
 1: sp_archea:*
 2: sp_bacteria:*
 3: sp_fungi:*
 4: sp_human:*
 5: sp_invertebrate:*
 6: sp_mammal:*
 7: sp_mhc:*
 8: sp_organelle:*
 9: sp_phage:*
10: sp_plant:*
11: sp_rodent:*
12: sp_virus:*
13: sp_vertebrate:*
14: sp_unclassified:*
15: sp_rvirus:*
16: sp_bacteriap:*
17: sp_archeap:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
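
The "% Query Match" values in the summaries and alignments of this report appear consistent with the hit score expressed as a percentage of the perfect score (81 for this query). A short Python sketch of that relationship follows; the function name and the one-decimal rounding are assumptions made for illustration, since the report itself does not state the formula.

```python
PERFECT_SCORE = 81  # "Perfect score" reported for query US-09-641-801-5

def query_match_percent(score: float, perfect: float = PERFECT_SCORE) -> float:
    """Hit score as a percentage of the perfect score, to one decimal place."""
    return round(100.0 * score / perfect, 1)

# Scores taken from this report: 48, 45, 41.5 and 41.
for s in (48, 45, 41.5, 41):
    print(s, query_match_percent(s))
```

Under this assumption a score of 48 corresponds to 59.3% and a score of 41.5 to 51.2%, the same figures printed with the corresponding results.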



SUMMARIES 

                     %
Result           Query
   No.   Score   Match  Length  DB  ID          Description

     1      48    59.3     232   4  Q8NAJ2      Q8naj2 homo sapien
     2      47    58.0    1766   3  P78584      P78584 aspergillus
     3      46    56.8     180  17  Q8TW43      Q8tw43 methanopyru
     4      45    55.6      79  11  Q99ME4      Q99me4 rattus norv
     5      45    55.6      80  10  Q8LF16      Q8lf16 arabidopsis
     6      45    55.6      91  10  Q8S8R5      Q8s8r5 arabidopsis
     7      45    55.6     219   5  Q22140      Q22140 caenorhabdi
     8      45    55.6     446  16  Q9KTH8      Q9kth8 vibrio chol
     9      45    55.6     557  11  Q99L21      Q99l21 mus musculu
    10      45    55.6     599  11  Q8C4X8      Q8c4x8 mus musculu
    11      45    55.6     599  11  Q8C3K9      Q8c3k9 mus musculu
    12      45    55.6     599  11  Q8C2N3      Q8c2n3 mus musculu
    13      45    55.6     607  11  Q8BSP2      Q8bsp2 mus musculu
    14      45    55.6     619   4  Q9H5Q7      Q9h5q7 homo sapien
    15      45    55.6     668   4  Q9H5C7      Q9h5c7 homo sapien
    16      45    55.6     772   5  Q9BLH1      Q9blh1 bombyx mori
    17      45    55.6     902  16  Q8G762      Q8g762 bifidobacte
    18      45    55.6    1307   4  Q9C093      Q9c093 homo sapien
    19      45    55.6    1744  11  Q9R095      Q9r095 rattus norv
    20      43    53.1     388   5  Q9XTX7      Q9xtx7 caenorhabdi
    21      43    53.1     421  17  O28350      O28350 archaeoglob
    22      43    53.1     444  10  Q9SR06      Q9sr06 arabidopsis
    23    42.5    52.5     681  10  Q9LRV1      Q9lrv1 arabidopsis
    24      42    51.9     218   3  Q03778      Q03778 saccharomyc
    25      42    51.9     232   5  Q9VYG4      Q9vyg4 drosophila
    26      42    51.9     260  10  Q94AV8      Q94av8 arabidopsis
    27      42    51.9     265  16  Q889N4      Q889n4 pseudomonas
    28      42    51.9     367   5  Q22069      Q22069 caenorhabdi
    29      42    51.9     448  16  Q8YXW8      Q8yxw8 anabaena sp
    30      42    51.9     468   5  Q9NF32      Q9nf32 drosophila
    31      42    51.9     469   5  Q9W5D6      Q9w5d6 drosophila
    32      42    51.9     473  16  Q8YMB2      Q8ymb2 anabaena sp
    33      42    51.9     529  10  Q84WU7      Q84wu7 arabidopsis
    34      42    51.9     607  11  Q8CAZ3      Q8caz3 mus musculu
    35      42    51.9     696   8  Q85WS8      Q85ws8 pinus korai
    36      42    51.9    1312   3  Q8WZV2      Q8wzv2 neurospora
    37      42    51.9    1504   2  Q9ZGA6      Q9zga6 streptomyce
    38      41    50.6     174  10  Q8RWM7      Q8rwm7 arabidopsis
    39      41    50.6     194  16  Q88VZ0      Q88vz0 lactobacill
    40      41    50.6     213  10  Q9LVY6      Q9lvy6 arabidopsis
    41      41    50.6     309  16  Q89HQ5      Q89hq5 bradyrhizob
    42      41    50.6     337  15  Q7SMJ4      Q7smj4 human immun
    43      41    50.6     395  16  Q98D83      Q98d83 rhizobium l
    44      41    50.6     397  16  Q82PA6      Q82pa6 streptomyce
    45      41    50.6     417   2  Q9KWC4      Q9kwc4 agrobacteri



ALIGNMENTS 



RESULT 1 
Q8NAJ2 

ID Q8NAJ2 PRELIMINARY; PRT; 232 AA.
AC Q8NAJ2;
DT 01-OCT-2002 (TrEMBLrel. 22, Created)
DT 01-OCT-2002 (TrEMBLrel. 22, Last sequence update)
DT 01-OCT-2002 (TrEMBLrel. 22, Last annotation update)
DE Hypothetical protein FLJ35269.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=Prostate;
RA Ishibashi T., Kanehori K., Yosida M., Watanabe S., Ishida S., Ono Y.,
RA Hotuta T., Hiraoka S., Murakawa K., Takiguchi S., Kusano J.,
RA Watanabe M., Fujimori K., Tanai H., Ishida M., Yamashita H., Chiba Y.,
RA Sugiyama T., Irie R., Otsuki T., Sato H., Wakamatsu A., Ishii S.,
RA Yamamoto J., Isono Y., Kawai-Hio Y., Saito K., Nishikawa T.,
RA Kimura K., Matsuo K., Nakamura Y., Sekine M., Kikuchi H., Kanda K.,
RA Wagatsuma M., Takahashi-Fujii A., Oshima A., Sugiyama A., Kawakami B.,
RA Suzuki Y., Sugano S., Nagahari K., Masuho Y., Nagai K., Isogai T.;
RT "NEDO human cDNA sequencing project.";
RL Submitted (JUL-2002) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AK092588; BAC03921.1;
KW Hypothetical protein.
SQ SEQUENCE 232 AA; 25113 MW; 36707CBA84594AC8 CRC64;

Query Match 59.3%; Score 48; DB 4; Length 232;
Best Local Similarity 66.7%; Pred. No. 7.6;
Matches 8; Conservative 2; Mismatches 2; Indels 0; Gaps 0;

Qy          3 EMPVLPVEPFPF 14
              : ||||| |:||
Db        209 KFPVLPVHPWPF 220



RESULT 2 
P78584 

ID P78584 PRELIMINARY; PRT; 1766 AA.
AC P78584;
DT 01-MAY-1997 (TrEMBLrel. 03, Created)
DT 01-MAY-1997 (TrEMBLrel. 03, Last sequence update)
DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update)
DE Polyketide synthase PKSL2.
GN PKSL2.
OS Aspergillus parasiticus.
OC Eukaryota; Fungi; Ascomycota; Pezizomycotina; Eurotiomycetes;
OC Eurotiales; Trichocomaceae; mitosporic Trichocomaceae; Aspergillus.
OX NCBI_TaxID=5067;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=NRRL 2999;
RX MEDLINE=98268975; PubMed=9603849;
RA Feng G.H., Leonard T.J.;
RT "Culture conditions control expression of the genes for aflatoxin and
RT sterigmatocystin biosynthesis in Aspergillus parasiticus and A.
RT nidulans.";
RL Appl. Environ. Microbiol. 64:2275-2277(1998).
DR EMBL; U52151; AAC23536.1;
DR GO; GO:0016740; F:transferase activity; IEA.
DR GO; GO:0006633; P:fatty acid biosynthesis; IEA.
DR GO; GO:0008152; P:metabolism; IEA.
DR InterPro; IPR001227; Ac_trans.
DR InterPro; IPR000794; Ketoacyl_synth.
DR InterPro; IPR006163; Pp_bind.
DR Pfam; PF00698; Acyl_transf; 1.
DR Pfam; PF00109; ketoacyl-synt; 1.
DR Pfam; PF02801; ketoacyl-synt_C; 1.
DR Pfam; PF00550; pp-binding; 1.
DR PROSITE; PS50075; ACP_DOMAIN; 1.
DR PROSITE; PS00606; B_KETOACYL_SYNTHASE; 1.
KW Phosphopantetheine; Transferase.
SQ SEQUENCE 1766 AA; 192068 MW; E20C4BF26F60671E CRC64;

Query Match 58.0%; Score 47; DB 3; Length 1766;
Best Local Similarity 60.0%; Pred. No. 89;
Matches 9; Conservative 2; Mismatches 4; Indels 0; Gaps 0;

Qy          1 DLEMPVLPVEPFPFV 15
              ||||||||:    :|
Db       1340 DLEMPVLPLATMKYV 1354



RESULT 3 
Q8TW43 

ID Q8TW43 PRELIMINARY; PRT; 180 AA.
AC Q8TW43;
DT 01-JUN-2002 (TrEMBLrel. 21, Created)
DT 01-JUN-2002 (TrEMBLrel. 21, Last sequence update)
DT 01-JUN-2002 (TrEMBLrel. 21, Last annotation update)
DE Uncharacterized membrane protein.
GN MK1194.
OS Methanopyrus kandleri.
OC Archaea; Euryarchaeota; Methanopyri; Methanopyrales; Methanopyraceae;
OC Methanopyrus.
OX NCBI_TaxID=2320;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=AV19 / DSM 6324 / JCM 9639;
RX MEDLINE=21927647; PubMed=11930014;
RA Slesarev A.I., Mezhevaya K.V., Makarova K.S., Polushin N.N.,
RA Shcherbinina O.V., Shakhova V.V., Belova G.I., Aravind L.,
RA Natale D.A., Rogozin I.B., Tatusov R.L., Wolf Y.I., Stetter K.O.,
RA Malykh A.G., Koonin E.V., Kozyavkin S.A.;
RT "The complete genome of hyperthermophile Methanopyrus kandleri AV19
RT and monophyly of archaeal methanogens.";
RL Proc. Natl. Acad. Sci. U.S.A. 99:4644-4649(2002).
DR EMBL; AE010411; AAM02407.1; -.
KW Complete proteome.
SQ SEQUENCE 180 AA; 19963 MW; 8935B0CADA923F75 CRC64;

Query Match 56.8%; Score 46; DB 17; Length 180;
Best Local Similarity 71.4%; Pred. No. 12;
Matches 10; Conservative 0; Mismatches 4; Indels 0; Gaps 0;

Qy          2 LEMPVLPVEPFPFV 15
              ||  |||| | |||
Db         29 LECSVLPVPPEPFV 42



RESULT 4 
Q99ME4 

ID Q99ME4 PRELIMINARY; PRT; 79 AA.
AC Q99ME4;
DT 01-JUN-2001 (TrEMBLrel. 17, Created)
DT 01-JUN-2001 (TrEMBLrel. 17, Last sequence update)
DT 01-JUN-2001 (TrEMBLrel. 17, Last annotation update)
DE Thyroid hormone-response protein-1.
OS Rattus norvegicus (Rat).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
OX NCBI_TaxID=10116;
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=Brain;
RA Xie C., Yang Y., Yang Y., Cai D., Cheng G., Li G., Luo M.;
RT "Rat thyroid hormone-response gene-1 cloned from brain.";
RL Submitted (FEB-2001) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AF348365; AAK15769.1; -.
SQ SEQUENCE 79 AA; 8994 MW; F884278152833C09 CRC64;

Query Match 55.6%; Score 45; DB 11; Length 79;
Best Local Similarity 63.6%; Pred. No. 7.7;
Matches 7; Conservative 3; Mismatches 1; Indels 0; Gaps 0;

Qy          3 EMPVLPVEPFP 13
              |:||||::| |
Db          8 EVPVLPLQPLP 18



RESULT 5 
Q8LF16 

ID Q8LF16 PRELIMINARY; PRT; 80 AA.
AC Q8LF16;
DT 01-OCT-2002 (TrEMBLrel. 22, Created)
DT 01-OCT-2002 (TrEMBLrel. 22, Last sequence update)
DT 01-OCT-2002 (TrEMBLrel. 22, Last annotation update)
DE Hypothetical protein.
OS Arabidopsis thaliana (Mouse-ear cress).
OC Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
OC Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots; rosids;
OC eurosids II; Brassicales; Brassicaceae; Arabidopsis.
OX NCBI_TaxID=3702;
RN [1]
RP SEQUENCE FROM N.A.
RA Haas B.J., Volfovsky N., Town C.D., Troukhan M., Alexandrov N.,
RA Feldmann K.A., Flavell R.B., White O., Salzberg S.L.;
RT "Full-length messenger RNA sequences greatly improve genome
RT annotation.";
RL Genome Biol. 0:0-0(2002).
RN [2]
RP SEQUENCE FROM N.A.
RA Brover V., Troukhan M., Alexandrov N., Lu Y.-P., Flavell R.,
RA Feldmann K.;
RT "Full-Length cDNA from Arabidopsis thaliana.";
RL Submitted (MAR-2002) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AY085097; AAM61651.1;
KW Hypothetical protein.
SQ SEQUENCE 80 AA; 8671 MW; BB1EF444B1A34E81 CRC64;

Query Match 55.6%; Score 45; DB 10; Length 80;
Best Local Similarity 77.8%; Pred. No. 7.8;
Matches 7; Conservative 1; Mismatches 1; Indels 0; Gaps 0;

Qy          5 PVLPVEPFP 13
              ||:| ||||
Db         38 PVIPTEPFP 46



RESULT 6
Q8S8R5
ID Q8S8R5 PRELIMINARY; PRT; 91 AA.
AC Q8S8R5;
DT 01-JUN-2002 (TrEMBLrel. 21, Created)
DT 01-JUN-2002 (TrEMBLrel. 21, Last sequence update)
DT 01-JUN-2002 (TrEMBLrel. 21, Last annotation update)
DE Expressed protein.
GN AT2G02515.
OS Arabidopsis thaliana (Mouse-ear cress).
OC Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
OC Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots; rosids;
OC eurosids II; Brassicales; Brassicaceae; Arabidopsis.
OX NCBI_TaxID=3702;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=cv. Columbia;
RA Lin X., Kaul S., Town C.D., Benito M.-I., Creasy T.H., Haas B.J.,
RA Wu D., Maiti R., Ronning C.M., Koo H., Fujii C.Y., Utterback T.R.,
RA Barnstead M.E., Bowman C.L., White O., Nierman W.C., Fraser C.M.;
RT "Arabidopsis thaliana chromosome 2 BAC T8K22 genomic sequence.";
RL Submitted (FEB-2002) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AC004136; AAM14912.1;
SQ SEQUENCE 91 AA; 9870 MW; CF45D5B3FB66192D CRC64;

Query Match 55.6%; Score 45; DB 10; Length 91;
Best Local Similarity 77.8%; Pred. No. 8.9;
Matches 7; Conservative 1; Mismatches 1; Indels 0; Gaps 0;

Qy          5 PVLPVEPFP 13
              ||:| ||||
Db         49 PVIPTEPFP 57



RESULT 7 
Q22140 

ID Q22140 PRELIMINARY; PRT; 219 AA.
AC Q22140;
DT 01-NOV-1996 (TrEMBLrel. 01, Created)
DT 01-NOV-1996 (TrEMBLrel. 01, Last sequence update)
DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update)
DE T04A8.11 protein.
GN T04A8.11.
OS Caenorhabditis elegans.
OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea;
OC Rhabditidae; Peloderinae; Caenorhabditis.
OX NCBI_TaxID=6239;
RN [1]
RP SEQUENCE FROM N.A.
RA Palmer S.;
RL Submitted (AUG-1994) to the EMBL/GenBank/DDBJ databases.
RN [2]
RP SEQUENCE FROM N.A.
RX MEDLINE=99069613; PubMed=9851916;
RA none;
RT "Genome sequence of the nematode C.elegans: A platform for
RT investigating biology.";
RL Science 282:2012-2018(1998).
DR EMBL; Z35663; CAA84730.1;
DR PIR; T24429; T24429.
DR WormPep; T04A8.11; CE01066.
DR GO; GO:0005622; C:intracellular; IEA.
DR GO; GO:0005840; C:ribosome; IEA.
DR GO; GO:0003735; F:structural constituent of ribosome; IEA.
DR GO; GO:0006412; P:protein biosynthesis; IEA.
DR InterPro; IPR000114; Ribosomal_L16.
DR Pfam; PF00252; Ribosomal_L16; 1.
DR PRINTS; PR00060; RIBOSOMALL16.
SQ SEQUENCE 219 AA; 25360 MW; E22A1E0A573C3FDE CRC64;

Query Match 55.6%; Score 45; DB 5; Length 219;
Best Local Similarity 63.6%; Pred. No. 22;
Matches 7; Conservative 3; Mismatches 1; Indels 0; Gaps 0;

Qy          2 LEMPVLPVEPF 12
              |::||:| |||
Db         21 LKLPVMPAEPF 31



RESULT 8
Q9KTH8
ID Q9KTH8 PRELIMINARY; PRT; 446 AA.
AC Q9KTH8;
DT 01-OCT-2000 (TrEMBLrel. 15, Created)
DT 01-OCT-2000 (TrEMBLrel. 15, Last sequence update)
DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update)
DE CapK protein, putative.
GN VC0924.
OS Vibrio cholerae.
OC Bacteria; Proteobacteria; Gammaproteobacteria; Vibrionales;
OC Vibrionaceae; Vibrio.
OX NCBI_TaxID=666;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=El Tor N16961 / Serotype O1;
RX MEDLINE=20406833; PubMed=10952301;
RA Heidelberg J.F., Eisen J.A., Nelson W.C., Clayton R.A., Gwinn M.L.,
RA Dodson R.J., Haft D.H., Hickey E.K., Peterson J.D., Umayam L.A.,
RA Gill S.R., Nelson K.E., Read T.D., Tettelin H., Richardson D.,
RA Ermolaeva M.D., Vamathevan J., Bass S., Qin H., Dragoi I., Sellers P.,
RA McDonald L., Utterback T., Fleischmann R.D., Nierman W.C., White O.,
RA Salzberg S.L., Smith H.O., Colwell R.R., Mekalanos J.J., Venter J.C.,
RA Fraser C.M.;
RT "DNA sequence of both chromosomes of the cholera pathogen Vibrio
RT cholerae.";
RL Nature 406:477-483(2000).
DR EMBL; AE004175; AAF94086.1; -.
DR PIR; H82264; H82264.
DR TIGR; VC0924;
KW Complete proteome.
SQ SEQUENCE 446 AA; 50713 MW; 85BDAC396E2EC45D CRC64;

Query Match 55.6%; Score 45; DB 16; Length 446;
Best Local Similarity 53.3%; Pred. No. 45;
Matches 8; Conservative 4; Mismatches 3; Indels 0; Gaps 0;

Qy          1 DLEMPVLPVEPFPFV 15
              |||:| | :| ||::
Db         76 DLEVPNLELEAFPYL 90


RESULT 9 
Q99L21 



ID Q99L21 PRELIMINARY; PRT; 557 AA.
AC Q99L21;
DT 01-JUN-2001 (TrEMBLrel. 17, Created)
DT 01-JUN-2001 (TrEMBLrel. 17, Last sequence update)
DT 01-OCT-2002 (TrEMBLrel. 22, Last annotation update)
DE Similar to hypothetical protein 384D8_6.
GN 0610010J20RIK.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RA Strausberg R.;
RL Submitted (FEB-2001) to the EMBL/GenBank/DDBJ databases.
DR EMBL; BC003900; AAH03900.1;
KW Hypothetical protein.
SQ SEQUENCE 557 AA; 63153 MW; 742CD81FEA32A7C6 CRC64;

Query Match 55.6%; Score 45; DB 11; Length 557;
Best Local Similarity 77.8%; Pred. No. 57;
Matches 7; Conservative 2; Mismatches 0; Indels 0; Gaps 0;

Qy          5 PVLPVEPFP 13
              ||:||||:|
Db        154 PVVPVEPYP 162



RESULT 10 
Q8C4X8 

ID Q8C4X8 PRELIMINARY; PRT; 599 AA.
AC Q8C4X8;
DT 01-MAR-2003 (TrEMBLrel. 23, Created)
DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE Hypothetical protein.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=C57BL/6J; TISSUE=Cerebellum;
RX MEDLINE=22354683; PubMed=12466851;
RA The FANTOM Consortium,
RA the RIKEN Genome Exploration Research Group Phase I & II Team;
RT "Analysis of the mouse transcriptome based on functional annotation
RT 60,770 full-length cDNAs.";
RL Nature 420:563-573(2002).
DR EMBL; AK080447; BAC37919.1; -.
KW Hypothetical protein.
SQ SEQUENCE 599 AA; 68024 MW; 8591FEA89E95ECF3 CRC64;

Query Match 55.6%; Score 45; DB 11; Length 599;
Best Local Similarity 77.8%; Pred. No. 61;
Matches 7; Conservative 2; Mismatches 0; Indels 0; Gaps 0;

Qy          5 PVLPVEPFP 13
              ||:||||:|
Db        196 PVVPVEPYP 204



RESULT 11 
Q8C3K9 

ID Q8C3K9 PRELIMINARY; PRT; 599 AA.
AC Q8C3K9;
DT 01-MAR-2003 (TrEMBLrel. 23, Created)
DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE Hypothetical protein.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=C57BL/6J; TISSUE=Kidney;
RX MEDLINE=22354683; PubMed=12466851;
RA The FANTOM Consortium,
RA the RIKEN Genome Exploration Research Group Phase I & II Team;
RT "Analysis of the mouse transcriptome based on functional annotation
RT 60,770 full-length cDNAs.";
RL Nature 420:563-573(2002).
DR EMBL; AK085584; BAC39479.1; -.
KW Hypothetical protein.
SQ SEQUENCE 599 AA; 68022 MW; 4332A8427D33B8DA CRC64;

Query Match 55.6%; Score 45; DB 11; Length 599;
Best Local Similarity 77.8%; Pred. No. 61;
Matches 7; Conservative 2; Mismatches 0; Indels 0; Gaps 0;

Qy          5 PVLPVEPFP 13
              ||:||||:|
Db        196 PVVPVEPYP 204



RESULT 12 
Q8C2N3 

ID Q8C2N3 PRELIMINARY; PRT; 599 AA.
AC Q8C2N3;
DT 01-MAR-2003 (TrEMBLrel. 23, Created)
DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE Hypothetical protein.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=NOD; TISSUE=Thymus;
RX MEDLINE=22354683; PubMed=12466851;
RA The FANTOM Consortium,
RA the RIKEN Genome Exploration Research Group Phase I & II Team;
RT "Analysis of the mouse transcriptome based on functional annotation
RT 60,770 full-length cDNAs.";
RL Nature 420:563-573(2002).
DR EMBL; AK088294; BAC40265.1; -.
KW Hypothetical protein.
SQ SEQUENCE 599 AA; 68023 MW; 4332A8427EDA9AB8 CRC64;

Query Match 55.6%; Score 45; DB 11; Length 599;
Best Local Similarity 77.8%; Pred. No. 61;
Matches 7; Conservative 2; Mismatches 0; Indels 0; Gaps 0;

Qy          5 PVLPVEPFP 13
              ||:||||:|
Db        196 PVVPVEPYP 204



RESULT 13
Q8BSP2
ID Q8BSP2 PRELIMINARY; PRT; 607 AA.
AC Q8BSP2;
DT 01-MAR-2003 (TrEMBLrel. 23, Created)
DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE Hypothetical protein.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=C57BL/6J; TISSUE=Forelimb;
RX MEDLINE=22354683; PubMed=12466851;
RA The FANTOM Consortium,
RA the RIKEN Genome Exploration Research Group Phase I & II Team;
RT "Analysis of the mouse transcriptome based on functional annotation
RT 60,770 full-length cDNAs.";
RL Nature 420:563-573(2002).
DR EMBL; AK031135; BAC27270.1;
KW Hypothetical protein.
SQ SEQUENCE 607 AA; 68945 MW; 616CA03BDD647852 CRC64;

Query Match 55.6%; Score 45; DB 11; Length 607;
Best Local Similarity 77.8%; Pred. No. 62;
Matches 7; Conservative 2; Mismatches 0; Indels 0; Gaps 0;

Qy          5 PVLPVEPFP 13
              ||:||||:|
Db        204 PVVPVEPYP 212



RESULT 14
Q9H5Q7
ID Q9H5Q7 PRELIMINARY; PRT; 619 AA.
AC Q9H5Q7;
DT 01-MAR-2001 (TrEMBLrel. 16, Created)
DT 01-MAR-2001 (TrEMBLrel. 16, Last sequence update)
DT 01-OCT-2002 (TrEMBLrel. 22, Last annotation update)
DE Hypothetical protein FLJ23164.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=Lung;
RA Kawakami T., Noguchi S., Itoh T., Shigeta K., Senba T., Matsumura K.,
RA Nakajima Y., Mizuno T., Morinaga M., Tanigami A., Fujiwara T., Ono T.,
RA Yamada K., Fujii Y., Ozaki K., Hirao M., Ohmori Y., Ota T., Suzuki Y.,
RA Obayashi M., Nishi T., Shibahara T., Tanaka T., Nakamura Y.,
RA Isogai T., Sugano S.;
RT "NEDO human cDNA sequencing project.";
RL Submitted (AUG-2000) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AK026817; BAB15563.1; -.
KW Hypothetical protein.
SQ SEQUENCE 619 AA; 70742 MW; 6D5395F9BA1265AF CRC64;

Query Match 55.6%; Score 45; DB 4; Length 619;
Best Local Similarity 50.0%; Pred. No. 64;
Matches 7; Conservative 4; Mismatches 3; Indels 0; Gaps 0;

Qy          1 DLEMPVLPVEPFPF 14
              |:::|  |:|| ||
Db        386 DIKIPENPLEPLPF 399



RESULT 15 
Q9H5C7 

ID Q9H5C7 PRELIMINARY; PRT; 668 AA.
AC Q9H5C7;
DT 01-MAR-2001 (TrEMBLrel. 16, Created)
DT 01-MAR-2001 (TrEMBLrel. 16, Last sequence update)
DT 01-OCT-2002 (TrEMBLrel. 22, Last annotation update)
DE Hypothetical protein FLJ23577.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=Lung;
RA Kawakami T., Noguchi S., Itoh T., Shigeta K., Senba T., Matsumura K.,
RA Nakajima Y., Mizuno T., Morinaga M., Tanigami A., Fujiwara T., Ono T.,
RA Yamada K., Fujii Y., Ozaki K., Hirao M., Ohmori Y., Ota T., Suzuki Y.,
RA Obayashi M., Nishi T., Shibahara T., Tanaka T., Nakamura Y.,
RA Isogai T., Sugano S.;
RT "NEDO human cDNA sequencing project.";
RL Submitted (AUG-2000) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AK027230; BAB15700.1; -.
KW Hypothetical protein.
SQ SEQUENCE 668 AA; 76854 MW; 3F46A3EE71940463 CRC64;

Query Match 55.6%; Score 45; DB 4; Length 668;
Best Local Similarity 50.0%; Pred. No. 69;
Matches 7; Conservative 4; Mismatches 3; Indels 0; Gaps 0;

Qy          1 DLEMPVLPVEPFPF 14
              |:::|  |:|| ||
Db        435 DIKIPENPLEPLPF 448



Search completed: August 24, 2004, 15:50:44 
Job time : 54.3433 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model

Run on: August 24, 2004, 14:57:04 ; Search time 8.0597 Seconds
(without alignments)
96.908 Million cell updates/sec

Title: US-09-641-801-5

Perfect score: 81

Sequence: 1 DLEMPVLPVEPFPFV 15

Scoring table: BLOSUM62

Gapop 10.0 , Gapext 0.5

Searched: 141681 seqs, 52070155 residues

Total number of hits satisfying chosen parameters: 141681

Minimum DB seq length: 0

Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%

Maximum Match 100%
Listing first 45 summaries

Database : SwissProt_42:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
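
The cell-update rate in the header of this search can be cross-checked from the other header figures. The sketch below assumes that one "cell" is a single query-residue by database-residue comparison in the Smith-Waterman matrix; the report does not define the term, so this reading is an assumption, and the function name is illustrative.

```python
QUERY_LENGTH = 15          # residues in DLEMPVLPVEPFPFV
DB_RESIDUES = 52_070_155   # "Searched: 141681 seqs, 52070155 residues"
SEARCH_SECONDS = 8.0597    # "Search time 8.0597 Seconds"

def million_cell_updates_per_sec(query_len: int, db_residues: int,
                                 seconds: float) -> float:
    """Matrix cells filled per second, in millions (assumed definition)."""
    return query_len * db_residues / seconds / 1e6

rate = million_cell_updates_per_sec(QUERY_LENGTH, DB_RESIDUES, SEARCH_SECONDS)
print(f"{rate:.3f} Million cell updates/sec")
```

With these inputs the computed rate agrees with the header value to three decimal places, which supports reading the figure as a decimal number of millions.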



SUMMARIES 



                     %
Result           Query
   No.   Score   Match  Length  DB  ID          Description

     1      43    53.1    1161   1  KCH2_RABIT  Q8wny2 oryctolagus
     2      42    51.9     696   1  RPOC_PINTH  P52733 pinus thunb
     3      40    49.4     438   1  RGSB_MOUSE  Q9z2h1 mus musculu
     4      40    49.4     491   1  G6PD_RHIME  Q9z3s2 rhizobium m
     5      40    49.4     509   1  DNAA_MYCPA  Q91717 mycobacteri
     6      39    48.1     207   1  YIOR_CVBF   P22654 bovine coro
     7      39    48.1     333   1  YE35_PYRAB  Q9uys2 pyrococcus
     8      39    48.1     438   1  MPK5_HUMAN  Q13163 homo sapien
     9      39    48.1     467   1  IRF6_HUMAN  O14896 homo sapien
    10      39    48.1     467   1  IRF6_MOUSE  P97431 mus musculu
    11      39    48.1     526   1  ACH1_YEAST  P32316 saccharomyc
    12      39    48.1     540   1  GRBE_HUMAN  Q14449 homo sapien
    13      39    48.1     576   1  LE12_RALSO  Q8xsz5 ralstonia s
    14      39    48.1     676   1  HUTU_HUMAN  Q96n76 homo sapien
    15      39    48.1     712   1  PPK_SYNPX   Q7u3d7 synechococc
    16      39    48.1     739   1  PPK_MYCLE   O33127 mycobacteri
    17      39    48.1     742   1  PPK_MYCTU   P95111 mycobacteri
    18      39    48.1    1020   1  VP34_CANAL  Q92213 candida alb
    19    38.5    47.5     302   1  CASH_MACEU  P28550 macropus eu
    20    38.5    47.5    1144   1  FLNC_MOUSE  Q8vhx6 mus musculu
    21    38.5    47.5    2725   1  FLNC_HUMAN  Q14315 homo sapien
    22      38    46.9     121   1  AMEL_TACAC  O97647 tachyglossu
    23      38    46.9     190   1  Y417_ARCFU  O29830 archaeoglob
    24      38    46.9     265   1  IHA_SHEEP   P38440 ovis aries
    25      38    46.9     477   1  MM03_HUMAN  P08254 homo sapien
    26      38    46.9     533   1  LCP2_HUMAN  Q13094 homo sapien
    27      38    46.9     582   1  HEMO_OPSTA  P43090 opsanus tau
    28      38    46.9     636   1  DXS_SYNLE   Q9r6s7 synechococc
    29      38    46.9     636   1  DXS_SYNP7   Q8gaa0 synechococc
    30      38    46.9    1135   1  RBL2_RAT    O55081 rattus norv
    31      38    46.9    2044   1  SIF2_DROME  P91620 drosophila
    32    37.5    46.3     865   1  CPN_DROME   Q02910 drosophila
    33      37    45.7     167   1  SERO_GALME  O76192 galleria me
    34      37    45.7     239   1  PNUC_SALTY  P24520 salmonella
    35      37    45.7     272   1  ATP6_BUCAP  O51878 buchnera ap
    36      37    45.7     346   1  XYLD_RHIME  Q92mt4 rhizobium m
    37      37    45.7     357   1  RFAK_ECOLI  P27242 escherichia
    38      37    45.7     417   1  Y943_METJA  Q58353 methanococc
    39      37    45.7     429   1  ISPG_DEIRA  Q9rxc9 deinococcus
    40      37    45.7     446   1  PFES_PSEAE  Q04804 pseudomonas
    41      37    45.7     486   1  ENV_HTLV2   P03383 human t-cel
    42      37    45.7     738   1  PAP_BOVIN   P25500 bos taurus
    43      37    45.7     738   1  PAP_MOUSE   Q61183 mus musculu
    44      37    45.7    1040   1  RIK1_SCHPO  Q10426 schizosacch
    45      37    45.7    1159   1  KCH2_HUMAN  Q12809 homo sapien



ALIGNMENTS 



RESULT 1 
KCH2_RABIT 

ID KCH2_RABIT STANDARD; PRT; 1161 AA.
AC Q8WNY2; O02731; O19119; O97586; Q9TV06;

DT 28-FEB-2003 (Rel. 41, Created) 

DT 28-FEB-2003 (Rel. 41, Last sequence update) 

DT 15-MAR-2004 (Rel. 43, Last annotation update) 

DE Potassium voltage-gated channel subfamily H member 2 (Ether-a-go-go 

DE related gene potassium channel 1) (ERG1) (RERG) (ra-erg) (Ether-a-go-
DE go related protein 1) (Eag related protein 1).
GN KCNH2 OR ERG.
OS Oryctolagus cuniculus (Rabbit).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Lagomorpha; Leporidae; Oryctolagus.
OX NCBI_TaxID=9986;

RN [1] 

RP SEQUENCE FROM N.A. (ISOFORMS 1 AND 2).
RA Witchel H.J., Hancox J.C., Levi A.J., Meech R.W.;

RT "RERG - rabbit ventricular ERG K+ channel subunit."; 

RL Submitted (JAN-1997) to the EMBL/GenBank/DDBJ databases. 

RN [2] 

RP SEQUENCE OF 411-571 FROM N.A. (ISOFORM 1/2). 

RX MEDLINE=97164986; PubMed=9012748;
RA Wymore R.S., Gintant G.A., Wymore R.T., Dixon J.E., McKinnon D.,
RA Cohen I.S.;
RT "Tissue and species distribution of mRNA for the IKr-like K+ channel,
RT erg.";
RL Circ. Res. 80:261-268(1997).

CC -!- FUNCTION: Pore-forming (alpha) subunit of voltage-gated inwardly
CC rectifying potassium channel. Channel properties are modulated by
CC cAMP and subunit assembly. Mediates the rapidly activating
CC component of the delayed rectifying potassium current in heart
CC (IKr) (By similarity).
CC -!- SUBUNIT: The potassium channel is probably composed of a homo- or
CC heterotetrameric complex of pore-forming alpha subunits that can
CC associate with modulating beta subunits. Heteromultimer with
CC KCNH6/ERG2, KCNH7/ERG3, KCNE1 and KCNE2 (By similarity).
CC -!- SUBCELLULAR LOCATION: Integral membrane protein.
CC -!- ALTERNATIVE PRODUCTS:
CC Event=Alternative splicing; Named isoforms=2;
CC Name=1;
CC IsoId=Q8WNY2-1; Sequence=Displayed;
CC Name=2;
CC IsoId=Q8WNY2-2; Sequence=VSP_000971;

CC -!- TISSUE SPECIFICITY: Detected in heart, both in atrium and in left 
CC ventricle. 

CC -!- DOMAIN: The segment S4 is probably the voltage-sensor and is 
CC characterized by a series of positively charged amino acids at 

CC every third position. 

CC -!- PTM: Phosphorylated on serine and threonine residues (By 
CC similarity).
CC -!- SIMILARITY: Belongs to the potassium channel family. H (Eag)
CC subfamily.
CC -!- SIMILARITY: Contains 1 cyclic nucleotide-binding domain.
CC -!- SIMILARITY: Contains 1 PAS (PER-ARNT-SIM) dimerization domain.
CC -!- SIMILARITY: Contains 1 PAS-associated C-terminal (PAC) domain.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; U87513; AAB68612.1; -.
DR EMBL; AF068736; AAC99425.1; -.
DR EMBL; AF105061; AAD39357.1; -.
DR EMBL; U75212; AAC48723.1; -.
DR InterPro; IPR000595; cNMP_binding.
DR InterPro; IPR003967; Erg_channel.
DR InterPro; IPR005821; Ion_trans.
DR InterPro; IPR001622; K+channel_pore.
DR InterPro; IPR005820; M+channel_nlg.

DR InterPro; IPR001610; PAC. 

DR InterPro; IPR000700; PAS-assoc_C. 

DR InterPro; IPR000014; PAS_domain. 

DR Pfam; PF00027; cNMP_binding; 1. 

DR Pfam; PF00520; ion_trans; 1.

DR Pfam; PF00785; PAC; 1. 

DR PRINTS; PR01470; ERGCHANNEL. 



DR SMART; SM00100; cNMP; 1.
DR SMART; SM00086; PAC; 1.
DR PROSITE; PS00888; CNMP_BINDING_1; FALSE_NEG.
DR PROSITE; PS00889; CNMP_BINDING_2; FALSE_NEG.
DR PROSITE; PS50042; CNMP_BINDING_3; 1.
DR PROSITE; PS50112; PAS; 1.
DR PROSITE; PS50113; PAC; 1.




KW Transport; Ion transport; Ionic channel; Voltage-gated channel;
KW Potassium channel; Potassium; Potassium transport; Transmembrane;
KW Phosphorylation; Glycoprotein; Multigene family; Alternative splicing.


FT DOMAIN 1 405 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 406 426 SEGMENT S1 (POTENTIAL).
FT TRANSMEM 453 473 SEGMENT S2 (POTENTIAL).
FT DOMAIN 474 497 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 498 518 SEGMENT S3 (POTENTIAL).
FT TRANSMEM [illegible] [illegible] SEGMENT S4 (POTENTIAL).
FT DOMAIN [illegible] 549 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 550 570 SEGMENT S5 (POTENTIAL).


FT DOMAIN 614 634 SEGMENT H5 (PORE-FORMING) (POTENTIAL).
FT TRANSMEM 641 661 SEGMENT S6 (POTENTIAL).
FT DOMAIN 662 1161 CYTOPLASMIC (POTENTIAL).
FT DOMAIN 17 88 PAS.
FT DOMAIN 92 144 PAC.
FT DOMAIN 299 302 POLY-PRO.
FT NP_BIND 744 861 CNMP.
FT CARBOHYD 600 600 N-LINKED (GLCNAC...) (POTENTIAL).
FT VARSPLIC 69 85 Missing (in isoform 2).
FT /FTId=VSP_000971.
FT CONFLICT 411 411 V -> A (IN REF. 2).
FT CONFLICT 445 446 PE -> TD (IN REF. 2).
FT CONFLICT 553 553 L -> F (IN REF. 2).
FT CONFLICT 561 561 L -> C (IN REF. 2).
SQ SEQUENCE 1161 AA; 126961 MW; 79B532B2FFBD9AEB CRC64;



Query Match 53.1%; Score 43; DB 1; Length 1161; 

Best Local Similarity 77.8%; Pred. No. 40; 

Matches 7; Conservative 1; Mismatches 1; Indels 0; Gaps 0;



Qy          5 PVLPVEPFP 13
              |:||| |||
Db       1095 PLLPVSPFP 1103
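
The "Best Local Similarity" figure reported for each hit appears to be the number of identical positions divided by the aligned length. The following is a minimal Python sketch under that assumption; the function name `local_similarity` is illustrative and is not part of the search program.

```python
def local_similarity(qy: str, db: str) -> tuple[int, float]:
    """Return (identical positions, percent identity over the aligned length).

    Assumes both strings are the aligned segments, padded with '-' where
    the alignment has a gap, so they have equal length.
    """
    if len(qy) != len(db):
        raise ValueError("aligned strings must have equal length")
    # A position counts as a match only if both residues are identical
    # and the position is not a gap.
    matches = sum(1 for a, b in zip(qy, db) if a == b and a != "-")
    return matches, round(100.0 * matches / len(qy), 1)


# Aligned segments from RESULT 1 (query 5-13 against KCH2_RABIT 1095-1103).
matches, pct = local_similarity("PVLPVEPFP", "PLLPVSPFP")
print(matches, pct)  # 7 77.8
```

The same calculation gives 8 of 12 (66.7%) for RESULT 2, which agrees with the figures printed for that hit.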



RESULT 2 
RPOC_PINTH 

ID RPOC_PINTH STANDARD; PRT; 696 AA.

AC P52733; 

DT 01-OCT-1996 (Rel. 34, Created)
DT 01-OCT-1996 (Rel. 34, Last sequence update)
DT 10-OCT-2003 (Rel. 42, Last annotation update)
DE DNA-directed RNA polymerase beta' chain (EC 2.7.7.6).
GN RPOC1.
OS Pinus thunbergii (Green pine) (Japanese black pine).
OG Chloroplast.
OC Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
OC Spermatophyta; Coniferopsida; Coniferales; Pinaceae; Pinus.
OX NCBI_TaxID=3350;



RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=95024047; PubMed=7937893;
RA Wakasugi T., Tsudzuki J., Ito S., Nakashima K., Tsudzuki T.,
RA Sugiura M.;
RT "Loss of all ndh genes as determined by sequencing the entire
RT chloroplast genome of the black pine Pinus thunbergii.";
RL Proc. Natl. Acad. Sci. U.S.A. 91:9794-9798(1994).

CC -!- FUNCTION: DNA-dependent RNA polymerase catalyzes the transcription 
CC of DNA into RNA using the four ribonucleoside triphosphates as 

CC substrates.
CC -!- CATALYTIC ACTIVITY: N nucleoside triphosphate = N diphosphate +
CC {RNA}(N).
CC -!- SUBUNIT: In chloroplasts, the RNA polymerase is composed of four
CC subunits: alpha, beta, beta', and beta''.
CC -!- SUBCELLULAR LOCATION: Chloroplast.

CC -!- SIMILARITY: Belongs to the RNA polymerase beta' chain family. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; D17510; BAA23472.1; -. 

DR PIR; T07447; T07447. 

DR HSSP; Q9KWU6; 1HQM.
DR InterPro; IPR000722; RNA_pol_A.
DR InterPro; IPR007080; RNA_pol_Rpb1_1.
DR InterPro; IPR007066; RNA_pol_Rpb1_3.
DR InterPro; IPR006592; RNA_polA_N.
DR Pfam; PF04997; RNA_pol_Rpb1_1; 1.
DR Pfam; PF00623; RNA_pol_Rpb1_2; 1.
DR Pfam; PF04983; RNA_pol_Rpb1_3; 1.
DR SMART; SM00663; RPOLA_N; 1.

KW Transferase; Transcription; DNA-directed RNA polymerase; Chloroplast. 

SQ SEQUENCE 696 AA; 79805 MW; 722B50492E077A63 CRC64; 

Query Match 51.9%; Score 42; DB 1; Length 696; 

Best Local Similarity 66.7%; Pred. No. 34; 

Matches 8; Conservative 1; Mismatches 3; Indels 0; Gaps 0; 

Qy          4 MPVLPVEPFPFV 15
              :|||| || | |
Db        285 LPVLPPEPRPIV 296
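
The "Query Match" percentages in this report are consistent with the hit score divided by the score of the query aligned to itself. A short Python sketch of that relationship follows; the self-score of 81 is an assumption taken from the 100% match reported elsewhere in this search output, and `query_match_percent` is an illustrative name.

```python
# Assumed: score of the 15-residue query aligned against itself.
QUERY_SELF_SCORE = 81


def query_match_percent(score: float, self_score: float = QUERY_SELF_SCORE) -> float:
    """Express a hit score as a percentage of the query's self-score."""
    return round(100.0 * score / self_score, 1)


# Scores of the hits listed in the ALIGNMENTS section.
for score in (43, 42, 40, 39):
    print(score, query_match_percent(score))
```

Under this assumption the four scores give 53.1, 51.9, 49.4 and 48.1, matching the values printed for RESULTS 1 through 10.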



RESULT 3 
RGSB_MOUSE 

ID RGSB_MOUSE STANDARD; PRT; 438 AA.

AC Q9Z2H1; 

DT 16-OCT-2001 (Rel. 40, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 16-OCT-2001 (Rel. 40, Last annotation update) 

DE Regulator of G-protein signaling 11 (RGS11) (Fragment).
GN RGS11.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;

RN [1] 

RP SEQUENCE FROM N.A. 

RA He W., Wensel T.G.;

RL Submitted (APR-1998) to the EMBL/GenBank/DDBJ databases. 

CC -!- FUNCTION: Inhibits signal transduction by increasing the GTPase
CC activity of G protein alpha subunits thereby driving them into
CC their inactive GDP-bound form (By similarity).
CC -!- SUBUNIT: HETERODIMER WITH GBETA5 (BY SIMILARITY).
CC -!- SIMILARITY: Contains 1 RGS domain.
CC -!- SIMILARITY: Contains 1 G protein gamma domain.
CC -!- SIMILARITY: Contains 1 DEP domain.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AF061934; AAC70012.1; -. 

DR HSSP; P49795; 1CMZ.
DR MGD; MGI:1354739; Rgs11.
DR InterPro; IPR000591; DEP.
DR InterPro; IPR001770; G-gamma.
DR InterPro; IPR000342; Regl_Gprotein.

DR Pfam; PF00610; DEP; 1. 

DR Pfam; PF00631; G-gamma; 1. 

DR Pfam; PF00615; RGS; 1. 

DR PRINTS; PR01301; RGSPROTEIN. 

DR ProDom; PD001580; Regl_Gprotein; 1. 

DR SMART; SM00049; DEP; 1. 

DR SMART; SM00224; GGL; 1. 

DR SMART; SM00315; RGS; 1. 

DR PROSITE; PS50186; DEP; 1. 

DR PROSITE; PS50058; G_PROTEIN_GAMMA; FALSE_NEG. 

DR PROSITE; PS50132; RGS; 1. 

KW Signal transduction inhibitor. 

FT NON_TER 1 1 

FT DOMAIN 6 81 DEP. 

FT DOMAIN 193 254 G PROTEIN GAMMA-LIKE. 

FT DOMAIN 275 390 RGS. 

SQ SEQUENCE 438 AA; 50430 MW; 5E7CF122CA843EA3 CRC64;



Query Match 49.4%; Score 40; DB 1; Length 438; 

Best Local Similarity 50.0%; Pred. No. 43; 

Matches 9; Conservative 3; Mismatches 2; Indels 4; Gaps 1; 



Qy          2 LEMPVLPVE----PFPFV 15
              ||  |:|:|    ||||:
Db        390 LEEAVIPLETKRWPFPFL 407
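
For gapped alignments such as this one, the "Indels" and "Gaps" counts can be read as the number of gap positions and the number of contiguous gap runs, respectively. The Python sketch below illustrates that reading; it assumes gaps are written as `-` in the aligned strings, and the helper name `indels_and_gaps` is hypothetical.

```python
import re


def indels_and_gaps(qy: str, db: str) -> tuple[int, int]:
    """Count gap characters (indels) and contiguous gap runs (gaps)."""
    # Every '-' in either aligned string is one inserted/deleted position.
    indels = qy.count("-") + db.count("-")
    # Each maximal run of '-' counts as a single gap.
    gaps = len(re.findall(r"-+", qy)) + len(re.findall(r"-+", db))
    return indels, gaps


# Aligned segments from RESULT 3 (query 2-15 against RGSB_MOUSE 390-407).
print(indels_and_gaps("LEMPVLPVE----PFPFV", "LEEAVIPLETKRWPFPFL"))  # (4, 1)
```

This reproduces the "Indels 4; Gaps 1" line reported for RESULT 3.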



RESULT 4 
G6PD_RHIME 

ID G6PD_RHIME STANDARD; PRT; 491 AA.

AC Q9Z3S2; 

DT 30-MAY-2000 (Rel. 39, Created) 

DT 28-FEB-2003 (Rel. 41, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Glucose-6-phosphate 1-dehydrogenase (EC 1.1.1.49) (G6PD).
GN ZWF OR R00704 OR SMC03070.
OS Rhizobium meliloti (Sinorhizobium meliloti).
OC Bacteria; Proteobacteria; Alphaproteobacteria; Rhizobiales;
OC Rhizobiaceae; Sinorhizobium/Ensifer group; Sinorhizobium.

OX NCBI_TaxID=382; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=99328961; PubMed=10400573;
RA Willis L.B., Walker G.C.;
RT "A novel Sinorhizobium meliloti operon encodes an alpha-glucosidase
RT and a periplasmic-binding-protein-dependent transport system for
RT alpha-glucosides.";

RL J. Bacteriol. 181:4176-4184(1999). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC STRAIN=1021;
RX MEDLINE=21396507; PubMed=11481430;
RA Capela D., Barloy-Hubler F., Gouzy J., Bothe G., Ampe F., Batut J.,
RA Boistard P., Becker A., Boutry M., Cadieu E., Dreano S., Gloux S.,
RA Godrie T., Goffeau A., Kahn D., Kiss E., Lelaure V., Masuy D.,
RA Pohl T., Portetelle D., Puehler A., Purnelle B., Ramsperger U.,
RA Renard C., Thebault P., Vandenbol M., Weidner S., Galibert F.;
RT "Analysis of the chromosome sequence of the legume symbiont
RT Sinorhizobium meliloti strain 1021.";
RL Proc. Natl. Acad. Sci. U.S.A. 98:9877-9882(2001).

CC -!- CATALYTIC ACTIVITY: D-glucose 6-phosphate + NADP(+) = D-glucono- 
CC 1,5-lactone 6-phosphate + NADPH. 

CC -!- PATHWAY: Pentose phosphate pathway; first step. 

CC -!- SIMILARITY: Belongs to the glucose-6-phosphate dehydrogenase 

CC family. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AF045609; AAD12043.1; -.
DR EMBL; AL591784; CAC45276.1; -.
DR HSSP; P11411; 1DPG.
DR InterPro; IPR001282; G6PD.
DR Pfam; PF00479; G6PD; 1.
DR Pfam; PF02781; G6PD_C; 1.
DR PRINTS; PR00079; G6PDHDRGNASE.

DR ProDom; PD001129; G6PD; 1. 

DR TIGRFAMs; TIGR00871; zwf; 1. 



DR PROSITE; PS00069; G6P_DEHYDROGENASE; 1.
KW Oxidoreductase; NADP; Glucose metabolism; Complete proteome.
FT ACT_SITE 184 184 BY SIMILARITY.
FT CONFLICT 401 401 R -> T (IN REF. 1).
SQ SEQUENCE 491 AA; 55301 MW; 0D8B1AFD094E1775 CRC64;

Query Match 49.4%; Score 40; DB 1; Length 491; 

Best Local Similarity 60.0%; Pred. No. 49; 

Matches 6; Conservative 3; Mismatches 1; Indels 0; Gaps 0; 

Qy          6 VLPVEPFPFV 15
              ::||||| :|
Db          5 IIPVEPFDYV 14



RESULT 5 
DNAA_MYCPA 

ID DNAA_MYCPA STANDARD; PRT; 509 AA. 

AC Q9L7L7; 

DT 16-OCT-2001 (Rel. 40, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update)

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Chromosomal replication initiator protein dnaA. 

GN DNAA. 

OS Mycobacterium paratuberculosis.
OC Bacteria; Actinobacteria; Actinobacteridae; Actinomycetales;
OC Corynebacterineae; Mycobacteriaceae; Mycobacterium.
OX NCBI_TaxID=1770;

RN [1] 

RP SEQUENCE FROM N.A. 

RA Zhang Q., Kapur V.; 

RT "Genomic organization of the Mycobacterium avium subsp. 

RT paratuberculosis origin of replication region."; 

RL Submitted (JAN-2000) to the EMBL/GenBank/DDBJ databases. 

CC -!- FUNCTION: Plays an important role in the initiation and regulation 

CC of chromosomal replication. Binds to the origin of replication; it 

CC binds specifically double-stranded DNA at a 9 bp consensus (dnaA 

CC box): 5'-TTATC(C/A)A(C/A)A-3'. DnaA binds to ATP and to acidic
CC phospholipids (By similarity).

CC -!- SIMILARITY: Belongs to the dnaA family. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AF222789; AAF33692.1; -.

DR HAMAP; MF_00377; -; 1. 

DR InterPro; IPR003593; AAA_ATPase. 

DR InterPro; IPR001957; Bac_DnaA. 

DR Pfam; PF00308; bac_dnaA; 1. 

DR PRINTS; PR00051; DNAA. 

DR SMART; SM00382; AAA; 1. 

DR TIGRFAMs; TIGR00362; DnaA; 1. 



DR PROSITE; PS01008; DNAA; 1. 

KW DNA replication; DNA-binding; ATP-binding. 

FT NP_BIND 210 217 ATP (POTENTIAL).
SQ SEQUENCE 509 AA; 56619 MW; 2472F3F644D34EC9 CRC64;

Query Match 49.4%; Score 40; DB 1; Length 509; 

Best Local Similarity 54.5%; Pred. No. 51; 

Matches 6; Conservative 2; Mismatches 3; Indels 0; Gaps 0; 

Qy          3 EMPVLPVEPFP 13
              : |: | ||||
Db        109 DAPIPPAEPFP 119



RESULT 6 




YIOR_CVBF
ID YIOR_CVBF STANDARD; PRT; 207 AA.
AC P22654;
DT 01-AUG-1991 (Rel. 19, Created)
DT 01-AUG-1991 (Rel. 19, Last sequence update)
DT 10-OCT-2003 (Rel. 42, Last annotation update)
DE Hypothetical protein in nucleocapsid ORF (IORF).
OS Bovine coronavirus (strain F15) (BCoV) (BCV).
OC Viruses; ssRNA positive-strand viruses, no DNA stage; Nidovirales;
OC Coronaviridae; Coronavirus.
OX NCBI_TaxID=11129;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=89087718; PubMed=3207501;
RA Cruciere C., Laporte J.;
RT "Sequence and analysis of bovine enteritic coronavirus (F15) genome.
RT I. Sequence of the gene coding for the nucleocapsid protein; analysis
RT of the predicted protein.";
RL Ann. Inst. Pasteur Virol. 139:123-138(1988).
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; M36656; AAA42759.1; -.
DR PIR; S06869; S06869.
DR InterPro; IPR004876; Corona_nucI.
DR Pfam; PF03187; Corona_I; 1.
KW Hypothetical protein.
SQ SEQUENCE 207 AA; 23001 MW; A4E5DE61171BAB50 CRC64;





Query Match 48.1%; Score 39; DB 1; Length 207; 

Best Local Similarity 54.5%; Pred. No. 28; 

Matches 6; Conservative 2; Mismatches 3; Indels 0; Gaps 0; 

Qy          5 PVLPVEPFPFV 15
              |:| :|| | |
Db        197 PILAIEPLPLV 207



RESULT 7 
YE35_PYRAB 

ID YE35_PYRAB STANDARD; PRT; 333 AA. 

AC Q9UYS2; 

DT 16-OCT-2001 (Rel. 40, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update)

DE Hypothetical protein PYRAB14350 precursor. 

GN PYRAB14350 OR PAB0953. 

OS Pyrococcus abyssi. 

OC Archaea; Euryarchaeota; Thermococci; Thermococcales; Thermococcaceae;

OC Pyrococcus. 

OX NCBI_TaxID=29292; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=GE5 / Orsay; 

RX MEDLINE=22511545; PubMed=12622808 ; 

RA Cohen G.N., Barbe V., Flament D., Galperin M., Heilig R., Lecompte O.,
RA Poch O., Prieur D., Querellou J., Ripp R., Thierry J.-C.,
RA Van der Oost J., Weissenbach J., Zivanovic Y., Forterre P.;
RT "An integrated analysis of the genome of the hyperthermophilic
RT archaeon Pyrococcus abyssi.";
RL Mol. Microbiol. 47:1495-1512(2003).

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AJ248287; CAB50340.1; -. 

DR PIR; G75055; G75055. 

DR InterPro; IPR007253; CW_binding_2.
DR InterPro; IPR000835; HTH_MarR.
DR Pfam; PF04122; CW_binding_2; 1.
DR PRINTS; PR00598; HTHMARR.
KW Hypothetical protein; Transmembrane; Signal; Complete proteome.
FT SIGNAL 1 23 POTENTIAL.
FT CHAIN 24 333 HYPOTHETICAL PROTEIN PYRAB14350.
FT TRANSMEM 232 252 POTENTIAL.
SQ SEQUENCE 333 AA; 37598 MW; 5C348C36EBBD6F14 CRC64;

Query Match 48.1%; Score 39; DB 1; Length 333; 

Best Local Similarity 60.0%; Pred. No. 47; 

Matches 6; Conservative 3; Mismatches 1; Indels 0; Gaps 0; 

Qy          2 LEMPVLPVEP 11
              |::|:||| |
Db         51 LDIPILPVNP 60



RESULT 8 
MPK5_HUMAN



ID MPK5_HUMAN STANDARD; PRT; 438 AA. 

AC Q13163; 

DT 01-NOV-1997 (Rel. 35, Created)
DT 01-NOV-1997 (Rel. 35, Last sequence update)
DT 16-OCT-2001 (Rel. 40, Last annotation update)
DE Dual specificity mitogen-activated protein kinase kinase 5
DE (EC 2.7.1.-) (MAP kinase kinase 5) (MAPKK 5) (MAPK/ERK kinase 5).
GN MAP2K5 OR PRKMK5 OR MEK5 OR MKK5.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A., AND MUTAGENESIS. 

RC TISSUE=Fetal brain; 

RX MEDLINE=95279403; PubMed=7759517;
RA Zhou G., Bao Z.Q., Dixon J.E.;
RT "Components of a new human protein kinase signal transduction
RT pathway.";
RL J. Biol. Chem. 270:12665-12669(1995).

CC -!- FUNCTION: INTERACTS SPECIFICALLY WITH ERK5, AND NOT WITH ANOTHER
CC MAP KINASE LIKE P38. IS NOT PHOSPHORYLATED BY RAFA, RAFB OR
CC RAFC. MAY INTERACT WITH GTPASES SUCH AS CDC42.
CC -!- TISSUE SPECIFICITY: EXPRESSED IN MANY ADULT TISSUE. ABUNDANT IN
CC HEART AND SKELETAL MUSCLE.
CC -!- PTM: ACTIVATED BY PHOSPHORYLATION ON SER/THR BY MAP KINASE KINASE
CC KINASES (BY SIMILARITY).

CC -!- SIMILARITY: Belongs to the Ser/Thr family of protein kinases. MAP 
CC kinase kinase subfamily. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; U25265; AAA96146.1; -. 

DR Genew; HGNC:6845; MAP2K5.
DR MIM; 602520; -.
DR GO; GO:0004672; F:protein kinase activity; TAS.
DR GO; GO:0007165; P:signal transduction; TAS.
DR InterPro; IPR000270; OPR_PB1.
DR InterPro; IPR000719; Prot_kinase.
DR InterPro; IPR008271; Ser_thr_pkin_AS.
DR InterPro; IPR001245; Tyr_pkinase.
DR Pfam; PF00564; PB1; 1.
DR Pfam; PF00069; pkinase; 1.
DR PRINTS; PR00109; TYRKINASE.
DR ProDom; PD000001; Prot_kinase; 1.
DR SMART; SM00666; PB1; 1.
DR PROSITE; PS00107; PROTEIN_KINASE_ATP; 1.
DR PROSITE; PS00108; PROTEIN_KINASE_ST; 1.
DR PROSITE; PS50011; PROTEIN_KINASE_DOM; 1.

DR PROSITE; PS50011; PROTEIN_KINASE_DOM; 1. 

KW Transferase; Serine/threonine-protein kinase; Tyrosine-protein kinase; 

KW ATP-binding; Phosphorylation. 



FT DOMAIN 166 409 PROTEIN KINASE.
FT NP_BIND 172 180 ATP (BY SIMILARITY).
FT BINDING 195 195 ATP.
FT ACT_SITE 283 283 BY SIMILARITY.
FT MOD_RES 311 311 PHOSPHORYLATION.
FT MOD_RES 315 315 PHOSPHORYLATION.
FT MUTAGEN 195 195 K->M: INACTIVATION.
FT MUTAGEN 311 311 S->A: INACTIVATION.
FT MUTAGEN 315 315 T->A: INACTIVATION.
SQ SEQUENCE 438 AA; 48968 MW; 21246312F1640EE2 CRC64;



Query Match 48.1%; Score 39; DB 1; Length 438; 

Best Local Similarity 58.8%; Pred. No. 63; 

Matches 10; Conservative 1; Mismatches 4; Indels 2; Gaps 1;

Qy          1 DLEMPVLPVEPF--PFV 15
              | : |||||  |  |||
Db        367 DEDSPVLPVGEFSEPFV 383



RESULT 9 






IRF6_HUMAN
ID IRF6_HUMAN STANDARD; PRT; 467 AA.
AC O14896;
DT 15-JUL-1998 (Rel. 36, Created)
DT 15-JUL-1998 (Rel. 36, Last sequence update)
DT 10-OCT-2003 (Rel. 42, Last annotation update)
DE Interferon regulatory factor 6 (IRF-6).
GN IRF6.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]
RP SEQUENCE FROM N.A.
RA Grossman A., Mittrucker H.W., Antonio L., Ozato K., Mak T.W.;
RL Submitted (SEP-1997) to the EMBL/GenBank/DDBJ databases.
RN [2]
RP SEQUENCE FROM N.A.
RA Grafham D.;
RL Submitted (JUN-1998) to the EMBL/GenBank/DDBJ databases.
RN [3]
RP SEQUENCE FROM N.A.
RC TISSUE=Placenta;
RX MEDLINE=22388257; PubMed=12477932;
RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G.,
RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D.,
RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K.,
RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F.,
RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L.,
RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E.,
RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C.,
RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J.,
RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H.,
RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W.,
RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A.,
RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A.,
RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G.,
RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C.,
RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M.,
RA Butterfield Y.S.N., Krzywinski M.I., Skalska U., Smailus D.E.,
RA Schnerch A., Schein J.E., Jones S.J.M., Marra M.A.;
RT "Generation and initial analysis of more than 15,000 full-length
RT human and mouse cDNA sequences.";
RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002).
RN [4]
RP VARIANTS VWS VAL-2; ALA-18; MET-18; ALA-39; GLY-61; ARG-70; SER-76;
RP HIS-88; GLY-90; HIS-98; GLN-250; ARG-273; 290-PHE--ASP-296 DELINS LEU;
RP PRO-294; ILE-297; GLU-320; MET-321; GLU-325; PRO-345; PHE-347;
RP SER-369; TRP-374 AND GLU-388, VARIANTS PPS GLY-60; THR-66; LYS-82;
RP CYS-84; HIS-84; GLU-89 AND ASN-430, AND VARIANT ILE-274.
RX MEDLINE=22242581; PubMed=12219090;
RA Kondo S., Schutte B.C., Richardson R.J., Bjork B.C., Knight A.S.,
RA Watanabe Y., Howard E., de Lima R.L.L., Daack-Hirsch S., Sander A.,
RA McDonald-McGinn D.M., Zackai E.H., Lammer E.J., Aylsworth A.S.,
RA Ardinger H.H., Lidral A.C., Pober B.R., Moreno L., Arcos-Burgos M.,
RA Valencia C., Houdayer C., Bahuau M., Moretti-Ferreira D.,
RA Richieri-Costa A., Dixon M.J., Murray J.C.;
RT "Mutations in IRF6 cause Van der Woude and popliteal pterygium
RT syndromes.";
RL Nat. Genet. 32:285-289(2002).

CC -!- SUBCELLULAR LOCATION: Nuclear (Potential).
CC -!- DISEASE: Defects in IRF6 are a cause of van der Woude syndrome
CC (VWS) [MIM:119300]; also known as lip-pit syndrome (LPS). It is an
CC autosomal dominant developmental disorder characterized by lower
CC lip pits, cleft lip and/or cleft palate. Penetrance is incomplete.
CC Van der Woude and popliteal pterygium syndrome are allelic
CC disorders.
CC -!- DISEASE: Defects in IRF6 are the cause of popliteal pterygium
CC syndrome (PPS) [MIM:119500]. PPS is an autosomal dominant
CC developmental disorder characterized by cleft lip and/or cleft
CC palate, and skin and genital anomalies. Penetrance is incomplete
CC and expressivity is variable. It shows orofacial phenotypic
CC similarities with van der Woude syndrome. Van der Woude and
CC popliteal pterygium syndrome are allelic disorders.

CC -!- SIMILARITY: Belongs to the IRF family. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).



CC 

DR EMBL; AF027292; AAB84111.1; -.
DR EMBL; AL022398; CAA18545.1; -.
DR EMBL; BC014852; AAH14852.1; -.
DR HSSP; P23906; 2IRF.
DR TRANSFAC; T05118; -.
DR Genew; HGNC:6121; IRF6.
DR MIM; 607199; -.
DR MIM; 119300; -.
DR MIM; 119500; -.



DR InterPro; IPR001346; IRF.
DR InterPro; IPR008984; SMAD_FHA.
DR Pfam; PF00605; IRF; 1.
DR PRINTS; PR00267; INTFRNREGFCT.
DR ProDom; PD002355; IRF; 1.
DR SMART; SM00348; IRF; 1.
DR PROSITE; PS00601; IRF; 1.
KW Transcription regulation; DNA-binding; Nuclear protein; Polymorphism;
KW Disease mutation.
FT DNA_BIND 9 111 TRYPTOPHAN PENTAD REPEAT.
FT VARIANT 2 2 A -> V (in VWS).
FT /FTId=VAR_014961.
FT VARIANT 18 18 V -> A (in VWS).
FT /FTId=VAR_014962.
FT VARIANT 18 18 V -> M (in VWS).
FT /FTId=VAR_014963.
FT VARIANT 39 39 P -> A (in VWS).
FT /FTId=VAR_014964.
FT VARIANT 60 60 W -> G (in PPS).
FT /FTId=VAR_014965.
FT VARIANT 61 61 A -> G (in VWS).
FT /FTId=VAR_014966.
FT VARIANT 66 66 K -> T (in PPS).
FT /FTId=VAR_014967.
FT VARIANT 70 70 G -> R (in VWS).
FT /FTId=VAR_014968.
FT VARIANT 76 76 P -> S (in VWS).
FT /FTId=VAR_014969.
FT VARIANT 82 82 Q -> K (in PPS).
FT /FTId=VAR_014970.
FT VARIANT 84 84 R -> C (in PPS).
FT /FTId=VAR_014971.
FT VARIANT 84 84 R -> H (in PPS).
FT /FTId=VAR_014972.
FT VARIANT 88 88 N -> H (in VWS).
FT /FTId=VAR_014973.
FT VARIANT 89 89 K -> E (in PPS).
FT /FTId=VAR_014974.
FT VARIANT 90 90 S -> G (in VWS).
FT /FTId=VAR_014975.
FT VARIANT 98 98 D -> H (in VWS).
FT /FTId=VAR_014976.
FT VARIANT 250 250 R -> Q (in VWS).
FT /FTId=VAR_014977.
FT VARIANT 273 273 Q -> R (in VWS).
FT /FTId=VAR_014978.
FT VARIANT 274 274 V -> I (common polymorphism; 3% in
FT European-descended and 22% in Asian
FT populations).
FT /FTId=VAR_014979.
FT VARIANT 290 296 FTSKLLD -> L (in VWS).
FT /FTId=VAR_014980.
FT VARIANT 294 294 L -> P (in VWS).
FT /FTId=VAR_014981.
FT VARIANT 297 297 V -> I (in VWS).
FT /FTId=VAR_014982.
FT VARIANT 320 320 K -> E (in VWS).
FT /FTId=VAR_014983.
FT VARIANT 321 321 V -> M (in VWS).
FT /FTId=VAR_014984.
FT VARIANT 325 325 G -> E (in VWS).
FT /FTId=VAR_014985.
FT VARIANT 345 345 L -> P (in VWS).
FT /FTId=VAR_014986.
FT VARIANT 347 347 C -> F (in VWS).
FT /FTId=VAR_014987.
FT VARIANT 369 369 F -> S (in VWS).
FT /FTId=VAR_014988.
FT VARIANT 374 374 C -> W (in VWS).
FT /FTId=VAR_014989.
FT VARIANT 388 388 K -> E (in VWS).
FT /FTId=VAR_014990.
FT VARIANT 430 430 D -> N (in PPS).
FT /FTId=VAR_014991.
SQ SEQUENCE 467 AA; 53129 MW; 7E28F5E0F5BA4053 CRC64;



Query Match 48.1%; Score 39; DB 1; Length 467; 

Best Local Similarity 41.7%; Pred. No. 67; 

Matches 5; Conservative 5; Mismatches 2; Indels 0; Gaps 0;

Qy          1 DLEMPVLPVEPF 12
              ::|:|  |::||
Db        199 EMEVPQAPIQPF 210



RESULT 10 
IRF6_MOUSE
ID IRF6_MOUSE STANDARD; PRT; 467 AA.

AC P97431; 

DT 28-FEB-2003 (Rel. 41, Created) 

DT 28-FEB-2003 (Rel. 41, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Interferon regulatory factor 6 (IRF-6) . 

GN IRF6. 

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=BALB/c; TISSUE=Colon; 

RA Grossman A., Mittrucker H.W., Antonio L., Mak T.W.;

RL Submitted (OCT-1996) to the EMBL/GenBank/DDBJ databases.

RN [2] 

RP TISSUE SPECIFICITY. 

RX MEDLINE=22242581; PubMed=12219090;

RA Kondo S., Schutte B.C., Richardson R.J., Bjork B.C., Knight A.S.,

RA Watanabe Y., Howard E., de Lima R.L.L., Daack-Hirsch S., Sander A.,

RA McDonald-McGinn D.M., Zackai E.H., Lammer E.J., Aylsworth A.S.,

RA Ardinger H.H., Lidral A.C., Pober B.R., Moreno L., Arcos-Burgos M.,

RA Valencia C., Houdayer C., Bahuau M., Moretti-Ferreira D.,

RA Richieri-Costa A., Dixon M.J., Murray J.C.; 

RT "Mutations in IRF6 cause Van der Woude and popliteal pterygium 

RT syndromes."; 



RL Nat. Genet. 32:285-289(2002).
CC -!- SUBCELLULAR LOCATION: Nuclear (Potential).
CC -!- TISSUE SPECIFICITY: High levels of expression along the medial
CC edge of the fusing palate, tooth buds, hair follicles, genitalia
CC and skin.
CC -!- SIMILARITY: Belongs to the IRF family.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; U73029; AAB36714.1;
DR HSSP; P23906; 2IRF.
DR TRANSFAC; T05119; -.
DR MGD; MGI:1859211; Irf6.
DR InterPro; IPR001346; IRF.
DR InterPro; IPR008984; SMAD_FHA.
DR Pfam; PF00605; IRF; 1.
DR PRINTS; PR00267; INTFRNREGFCT.
DR ProDom; PD002355; IRF; 1.
DR SMART; SM00348; IRF; 1.
DR PROSITE; PS00601; IRF; 1.
KW Transcription regulation; DNA-binding; Nuclear protein.
FT DNA_BIND 9 111 TRYPTOPHAN PENTAD REPEAT.
SQ SEQUENCE 467 AA; 53106 MW; 68CCAA90680FEDC8 CRC64;



Query Match 48.1%; Score 39; DB 1; Length 467;
Best Local Similarity 41.7%; Pred. No. 67;
Matches 5; Conservative 5; Mismatches 2; Indels 0; Gaps 0;

Qy   1 DLEMPVLPVEPF 12
       ::|:|  |::||
Db 199 EMEVPQAPIQPF 210



RESULT 11 
ACH1_YEAST 

ID ACH1_YEAST STANDARD; PRT; 526 AA.

AC P32316;

DT 01-OCT-1993 (Rel. 27, Created)

DT 01-OCT-1993 (Rel. 27, Last sequence update)

DT 15-MAR-2004 (Rel. 43, Last annotation update)

DE Acetyl-CoA hydrolase (EC 3.1.2.1) (Acetyl-CoA deacylase) (Acetyl-CoA

DE acylase).

GN ACH1 OR YBL015W OR YBL0304 OR YBL03.18.

OS Saccharomyces cerevisiae (Baker's yeast).

OC Eukaryota; Fungi; Ascomycota; Saccharomycotina; Saccharomycetes;

OC Saccharomycetales; Saccharomycetaceae ; Saccharomyces. 

OX NCBI_TaxID=4932; 

RN [1] 

RP SEQUENCE FROM N.A., AND PARTIAL SEQUENCE. 

RX MEDLINE=90237039; PubMed=1970569;

RA Lee F.-J.S., Lin L.-W., Smith J.A.;

RT "A glucose-repressible gene encodes acetyl-CoA hydrolase from

RT Saccharomyces cerevisiae.";

RL J. Biol. Chem. 265:7413-7418(1990).

RN [2]

RP SEQUENCE FROM N.A.

RC STRAIN=S288c;

RX MEDLINE=93070614; PubMed=1441754;

RA van Dyck L., Purnelle B., Skala J., Goffeau A.;

RT "An 11.4 kb DNA segment on the left arm of yeast chromosome II

RT carries the carboxypeptidase Y sorting gene PEP1, as well as ACH1,

RT FUS3 and a putative ARS.";

RL Yeast 8:769-776(1992).

CC -!- FUNCTION: PRESUMABLY INVOLVED IN REGULATING THE INTRACELLULAR

CC ACETYL-COA POOL FOR FATTY ACID AND CHOLESTEROL SYNTHESIS AND

CC FATTY ACID OXIDATION. IT MAY BE INVOLVED IN OVERALL REGULATION

CC OF ACETYLATION DURING MELATONIN SYNTHESIS.

CC -!- CATALYTIC ACTIVITY: Acetyl-CoA + H(2)O = CoA + acetate.

CC -!- SUBUNIT: Monomer.

CC -!- PTM: Glycosylated; contains mannose.

CC -!- SIMILARITY: TO N.CRASSA ACU-8, AND SOME, TO C.KLUYVERI CAT1.

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; M31036; AAA34388.1; 

DR EMBL; X68577; CAA48570.1; 

DR EMBL; Z35776; CAA84834.1; 

DR PIR; S28549; S28549. 

DR GermOnline; 138450; 

DR SGD; S0000111; ACH1.

DR GO; GO:0005829; C:cytosol; IDA.

DR GO; GO:0005739; C:mitochondrion; IDA.

DR GO; GO:0003986; F:acetyl-CoA hydrolase activity; IDA.

DR GO; GO:0006083; P:acetate metabolism; IMP.

DR InterPro; IPR003702; ActCoA_hydro.

DR Pfam; PF02550; AcetylCoA_hydro; 1. 

KW Hydrolase; Glycoprotein. 

FT CONFLICT 308 308 L -> F (IN REF. 2). 

FT CONFLICT 320 320 S -> A (IN REF. 2). 

FT CONFLICT 363 364 FP -> LG (IN REF. 2). 

SQ SEQUENCE 526 AA; 58768 MW; C0C61904F2196A9D CRC64; 

Query Match 48.1%; Score 39; DB 1; Length 526;

Best Local Similarity 41.2%; Pred. No. 76;

Matches 7; Conservative 6; Mismatches 2; Indels 2; Gaps 1;

Qy   1 DLEMPVLPV--EPFPFV 15
       |::||| |   :|:|::
Db 191 DIDMPVNPPFRKPYPYL 207
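
The summary figures for this hit can be checked directly against the two aligned strings. The sketch below is illustrative only and is not part of the search program's output: the function name `alignment_stats` is hypothetical, the gap is written as two dashes, and it assumes that "Best Local Similarity" is the number of identical columns divided by the total number of alignment columns, which is consistent with the values reported here. Conservative substitutions depend on the scoring matrix and are not counted.

```python
def alignment_stats(qy: str, db: str) -> dict:
    """Count identities, indel columns and gap runs in a pairwise alignment."""
    if len(qy) != len(db):
        raise ValueError("aligned strings must have equal length")
    matches = 0
    indels = 0
    gaps = 0
    in_gap = False
    for a, b in zip(qy, db):
        is_gap = a == "-" or b == "-"
        if is_gap:
            indels += 1
            if not in_gap:
                gaps += 1  # a new run of gap columns starts here
        elif a == b:
            matches += 1
        in_gap = is_gap
    columns = len(qy)
    return {
        "matches": matches,
        "indels": indels,
        "gaps": gaps,
        "columns": columns,
        # Assumption: similarity = identical columns / all alignment columns.
        "best_local_similarity": round(100.0 * matches / columns, 1),
    }


# Aligned strings for the ACH1_YEAST hit (RESULT 11).
stats = alignment_stats("DLEMPVLPV--EPFPFV", "DIDMPVNPPFRKPYPYL")
print(stats)
```

Under these assumptions the sketch gives 7 matches, 2 indels, 1 gap and 41.2% over 17 columns, in agreement with the reported summary.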



RESULT 12 



GRBE_HUMAN 

ID GRBE_HUMAN STANDARD; PRT; 540 AA. 

AC Q14449; 

DT 15-JUL-1999 (Rel. 38, Created) 

DT 15-JUL-1999 (Rel. 38, Last sequence update) 

DT 15-MAR-2004 (Rel. 43, Last annotation update) 

DE Growth factor receptor-bound protein 14 (GRB14 adapter protein).

GN GRB14.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.

OX NCBI_TaxID=9606; 
RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=96218175; PubMed=8647858;

RA Daly R.J., Sanderson G.M., Janes P.W., Sutherland R.L.;

RT "Cloning and characterization of GRB14, a novel member of the GRB7 

RT gene family."; 

RL J. Biol. Chem. 271:12502-12510(1996). 

CC -!- FUNCTION: Interacts with the cytoplasmic domain of the

CC autophosphorylated insulin receptor which is then inhibited. The

CC interaction is mediated by the SH2 domain (By similarity).

CC -!- SUBUNIT: Binds to the ankyrin repeat region of TNKS2 via its N-

CC terminus.

CC -!- SUBCELLULAR LOCATION: Cytoplasmic, associated with the Golgi and

CC endosomes.

CC -!- TISSUE SPECIFICITY: Expressed at high levels in the liver, kidney,

CC pancreas, testis, ovary, heart and skeletal muscle.

CC -!- PTM: Phosphorylated on serine residues.

CC -!- SIMILARITY: Contains 1 PH domain.

CC -!- SIMILARITY: Contains 1 Ras-associating domain.

CC -!- SIMILARITY: Contains 1 SH2 domain.

CC -!- SIMILARITY: Belongs to the GRB7/10/14 family.

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; L76687; AAC15861.1; -. 

DR HSSP; P35235; 1AYA.

DR Genew; HGNC:4565; GRB14.

DR MIM; 601524; -.

DR GO; GO:0005070; F:SH3/SH2 adaptor protein activity; TAS.

DR GO; GO:0007165; P:signal transduction; TAS.

DR InterPro; IPR001849; PH.

DR InterPro; IPR000159; RA_domain.

DR InterPro; IPR000980; SH2.

DR Pfam; PF00169; PH; 1.

DR Pfam; PF00788; RA; 1.

DR Pfam; PF00017; SH2; 1.

DR PRINTS; PR00401; SH2DOMAIN.

DR ProDom; PD000093; SH2; 1.

DR SMART; SM00233; PH; 1. 



DR SMART; SM00314; RA; 1.
DR SMART; SM00252; SH2; 1.
DR PROSITE; PS50003; PH_DOMAIN; 1.
DR PROSITE; PS50200; RA; 1.
DR PROSITE; PS50001; SH2; 1.
KW SH2 domain; Phosphorylation.
FT DOMAIN 106 192 RAS-ASSOCIATING.
FT DOMAIN 234 342 PH.
FT DOMAIN 439 535 SH2.
SQ SEQUENCE 540 AA; 60954 MW; A8FCFC16D7437B47 CRC64;



Query Match 48.1%; Score 39; DB 1; Length 540;
Best Local Similarity 50.0%; Pred. No. 78;
Matches 7; Conservative 3; Mismatches 4; Indels 0; Gaps 0;

Qy  1 DLEMPVLPVEPFPF 14
      ||::| :|  | ||
Db 69 DLDVPEMPSIPNPF 82



RESULT 13 
LE12_RALSO

ID LE12_RALSO STANDARD; PRT; 576 AA.
AC Q8XSZ5;
DT 28-FEB-2003 (Rel. 41, Created)
DT 28-FEB-2003 (Rel. 41, Last sequence update)
DT 10-OCT-2003 (Rel. 42, Last annotation update)
DE 2-isopropylmalate synthase 2 (EC 2.3.3.13) (Alpha-isopropylmalate
DE synthase 2) (Alpha-IPM synthetase 2).
GN LEUA2 OR RSP0322 OR RS05445.
OS Ralstonia solanacearum (Pseudomonas solanacearum).
OG Plasmid megaplasmid.
OC Bacteria; Proteobacteria; Betaproteobacteria; Burkholderiales;
OC Burkholderiaceae; Ralstonia.
OX NCBI_TaxID=305;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=GMI1000;
RX MEDLINE=21681879; PubMed=11823852;
RA Salanoubat M., Genin S., Artiguenave F., Gouzy J., Mangenot S.,
RA Arlat M., Billault A., Brottier P., Camus J.C., Cattolico L.,
RA Chandler M., Choisne N., Claudel-Renard C., Cunnac S., Demange N.,
RA Gaspin C., Lavie M., Moisan A., Robert C., Saurin W., Schiex T.,
RA Siguier P., Thebault P., Whalen M., Wincker P., Levy M.,
RA Weissenbach J., Boucher C.A.;
RT "Genome sequence of the plant pathogen Ralstonia solanacearum.";
RL Nature 415:497-502(2002).
CC -!- FUNCTION: Catalyzes the condensation of the acetyl group of
CC acetyl-CoA with 3-methyl-2-oxobutanoate (2-oxoisovalerate) to form
CC 3-carboxy-3-hydroxy-4-methylpentanoate (2-isopropylmalate).
CC -!- CATALYTIC ACTIVITY: Acetyl-CoA + 3-methyl-2-oxobutanoate + H(2)O =
CC 2-hydroxy-2-isopropylsuccinate + CoA.
CC -!- PATHWAY: Leucine biosynthesis; first step.
CC -!- SUBUNIT: Homotetramer (By similarity).
CC -!- SIMILARITY: Belongs to the alpha-IPM synthetase / homocitrate
CC synthase family. LeuA 2 subfamily.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; AL646078; CAD17473.1;
DR HAMAP; MF_00572; 1.
DR InterPro; IPR002034; AIPM/Hcit_synth.
DR InterPro; IPR000891; HMGL-like.
DR InterPro; IPR005668; LeuA_yeast.
DR Pfam; PF00682; HMGL-like; 1.
DR TIGRFAMs; TIGR00970; leuA_yeast; 1.
DR PROSITE; PS00815; AIPM_HOMOCIT_SYNTH_1; 1.
DR PROSITE; PS00816; AIPM_HOMOCIT_SYNTH_2; 1.
KW Leucine biosynthesis; Transferase; Plasmid; Complete proteome.
SQ SEQUENCE 576 AA; 63149 MW; BBCB0A9A66BA332B CRC64;



Query Match 48.1%; Score 39; DB 1; Length 576;
Best Local Similarity 66.7%; Pred. No. 84;
Matches 6; Conservative 2; Mismatches 1; Indels 0; Gaps 0;

Qy   3 EMPVLPVEP 11
       ||| ||::|
Db 356 EMPYLPIDP 364



RESULT 14 
HUTU_HUMAN 

ID HUTU_HUMAN STANDARD; PRT; 676 AA.

AC Q96N76; 

DT 28-FEB-2003 (Rel. 41, Created) 

DT 28-FEB-2003 (Rel. 41, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Probable urocanate hydratase (EC 4.2.1.49) (Urocanase) 

DE (Imidazolonepropionate hydrolase).

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Liver; 

RA Tashiro H., Yamazaki M., Watanabe K., Kumagai A., Itakura S.,

RA Fukuzumi Y., Fujimori Y., Komiyama M., Sugiyama T., Irie R.,

RA Otsuki T., Sato H., Ota T., Wakamatsu A., Ishii S., Yamamoto J.,

RA Isono Y., Kawai-Hio Y., Saito K., Nishikawa T., Kimura K.,

RA Yamashita H., Matsuo K., Nakamura Y., Sekine M., Kikuchi H., Kanda K.,

RA Wagatsuma M., Murakawa K., Kanehori K., Takahashi-Fujii A., Oshima A.,

RA Sugiyama A., Kawakami B., Suzuki Y., Sugano S., Nagahari K.,

RA Masuho Y., Nagai K., Isogai T.;

RT "NEDO human cDNA sequencing project.";

RL Submitted (OCT-2001) to the EMBL/GenBank/DDBJ databases.

CC -!- CATALYTIC ACTIVITY: 3-(5-oxo-4,5-dihydro-3-H-imidazol-4-
CC yl)propanoate = urocanate + H(2)O.
CC -!- COFACTOR: NAD (By similarity).
CC -!- PATHWAY: Histidine degradation; second step.
CC -!- SIMILARITY: Belongs to the urocanase family.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; AK055862; BAB71032.1; -.
DR GK; Q96N76;
DR MIM; 276880; -.
DR InterPro; IPR000193; Urocanase.
DR Pfam; PF01175; Urocanase; 1.
DR ProDom; PD025423; Urocanase; 1.
DR PROSITE; PS01233; UROCANASE; 1.
KW Hypothetical protein; Histidine metabolism; Lyase; NAD.
SQ SEQUENCE 676 AA; 74830 MW; C940D3D068648D17 CRC64;



Query Match 48.1%; Score 39; DB 1; Length 676;
Best Local Similarity 46.2%; Pred. No. 1e+02;
Matches 6; Conservative 3; Mismatches 4; Indels 0; Gaps 0;

Qy  1 DLEMPVLPVEPFP 13
      |:||   |:| :|
Db 82 DIEMRAYPIEQYP 94



RESULT 15 
PPK_SYNPX 

ID PPK_SYNPX STANDARD; PRT; 712 AA.

AC Q7U3D7; 

DT 15-MAR-2004 (Rel. 43, Created) 

DT 15-MAR-2004 (Rel. 43, Last sequence update) 

DT 15-MAR-2004 (Rel. 43, Last annotation update) 

DE Polyphosphate kinase (EC 2.7.4.1) (Polyphosphoric acid kinase) (ATP-

DE polyphosphate phosphotransferase).

GN PPK OR SYNW2495.

OS Synechococcus sp. (strain WH8102).

OC Bacteria; Cyanobacteria; Chroococcales; Synechococcus.

OX NCBI_TaxID=84588; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=22825697; PubMed=12917641;

RA Palenik B., Brahamsha B., Larimer F.W., Land M., Hauser L., Chain P.,

RA Lamerdin J., Regala W., Allen E.E., McCarren J., Paulsen I.,

RA Dufresne A., Partensky F., Webb E.A., Waterbury J.;

RT "The genome of a motile marine Synechococcus.";

RL Nature 424:1037-1042(2003).

CC -!- FUNCTION: Catalyzes the reversible transfer of the terminal

CC phosphate of ATP to form a long-chain polyphosphate (polyP).

CC -!- CATALYTIC ACTIVITY: ATP + {phosphate}(N) = ADP + {phosphate}(N+1).

CC -!- PTM: An intermediate of this reaction is the autophosphorylated

CC ppk in which a phosphate is covalently linked to histidine

CC residues through a N-P bond (By similarity).

CC -!- SIMILARITY: Belongs to the polyphosphate kinase family.

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; BX569695; CAE09010.1; -. 

DR HAMAP; MF_00347; -; 1. 

DR InterPro; IPR003414; PP_kinase. 

DR Pfam; PF02503; PP_kinase; 1. 

KW Transferase; Phosphorylation; Complete proteome. 

FT ACT_SITE 458 458 PHOSPHOHISTIDINE INTERMEDIATE (BY

FT SIMILARITY).

FT ACT_SITE 477 477 PHOSPHOHISTIDINE INTERMEDIATE (BY

FT SIMILARITY).

SQ SEQUENCE 712 AA; 80943 MW; 451977FE71AD95B2 CRC64;



Query Match 48.1%; Score 39; DB 1; Length 712; 

Best Local Similarity 52.9%; Pred. No. 1.1e+02;

Matches 9; Conservative 0; Mismatches 2; Indels 6; Gaps 1; 

Qy 5 PVL------PVEPFPFV 15

III I I II I I 

Db 140 PVLTPLAVDPMPFPFV 156
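
The percentages in this summary can be reproduced from the counts reported in the listing. The sketch below is a hedged illustration, not documentation of the search program: it assumes that "Query Match" is the hit score divided by the score of the query against an identical sequence (81, as reported for the 100.0% match in RESULT 1), and that "Best Local Similarity" is the number of matches divided by the number of alignment columns (matches + conservative + mismatches + indels). The helper `percent` is a hypothetical name introduced for the example.

```python
def percent(numerator: float, denominator: float) -> float:
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(100.0 * numerator / denominator, 1)


SELF_SCORE = 81  # assumption: score of the query against itself (RESULT 1)

# Counts reported for RESULT 15 (PPK_SYNPX).
score = 39
matches, conservative, mismatches, indels = 9, 0, 2, 6
columns = matches + conservative + mismatches + indels

query_match = percent(score, SELF_SCORE)
best_local_similarity = percent(matches, columns)
print(query_match, best_local_similarity)
```

With these inputs the sketch yields 48.1 and 52.9, the same values as the Query Match and Best Local Similarity fields for this hit.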



Search completed: August 24, 2004, 15:43:30 
Job time : 10.0597 secs



